Amendments to the Drawings:

The drawings were objected to for legibility of the letters in darkened boxes in the figures.
Applicants have submitted herewith replacement Figures 1A-1E in which the darkened boxes
have been removed to improve legibility and clarity. Accordingly, the objection to the drawings 
should be withdrawn. 

REMARKS

Status of the Claims 

Claims 1-23 were rejected. Claims 12-18, 20 and 21 have been canceled without 
prejudice or disclaimer. Applicant reserves the right to pursue subject matter of claims 12-18, 20 
and 21 in a continuation or divisional application. Claims 1-11, 19, and 22-23 are pending.

Claims 1, 2, 3, 11, 19, 22, and 23 have been amended. To expedite prosecution, claim 1
has been amended without prejudice or disclaimer to delete the subject matter drawn to a 
complement of the nucleic acid sequences of (a) through (d). Applicant reserves the right to 
pursue deleted subject matter of claim 1 in a continuation or divisional application. Claims 2, 3, 
11, 19, 22 and 23 have been amended to more clearly define the scope of the invention. No new 
matter has been entered by way of these amendments. 

The Objection to the Claims Should Be Withdrawn 

The Examiner has objected to Claim 11 for improper article usage. Claim 11 has been
amended to recite "the plant" and is, therefore, in proper format. Claim 2 was objected to for
failing to limit the subject matter of a previous claim from which it depends. Claim 2 has been
amended to recite "the isolated" and, as such, properly limits the scope of claim 2. Accordingly,
it is respectfully requested that the objection to claims 2 and 11 be withdrawn.

The Rejections Under 35 U.S.C. § 112, First Paragraph, Should be Withdrawn 
Enablement 

The Examiner rejected claims 1-11, 19 and 22-23 under 35 U.S.C. § 112, first paragraph,
on the grounds that the specification does not enable one skilled in the art to make or use the 
invention. This rejection is respectfully traversed. 

The Examiner asserts that the specification, while enabling for nucleic acids encoding 
SEQ ID NO:2 or 4, host cells, plants, plant cells and seeds comprising them, and method of 
using them to make SEQ ID NO:2 or 4, does not reasonably provide enablement for methods and 
compositions drawn to nucleic acids encoding pesticidal protein with 95% sequence identity to 
SEQ ID NO:2 or 4, nucleic acids with 95% identity to SEQ ID NO:1 or 3, or a complement of
those nucleic acids, host cells, plants, plant cells and seeds comprising them, and method of
using them to make a pesticidal protein with 95% identity to SEQ ID NO:1 or 3. The Examiner
states that the specification fails to provide guidance for which amino acids of SEQ ID NO:2 or 4 
can be altered and to which other amino acids, and which amino acids must not be changed, to 
maintain the activity of the encoded protein, as well as which regions of the protein can tolerate 
insertions and still produce a functional protein. 

The Examiner appears to be suggesting that in order to satisfy the enablement 
requirement Applicants must demonstrate that every pesticidal polypeptide and variant and 
fragment thereof encompassed by the claims could be used to successfully practice the invention, 
such that no experimentation would be required. According to the applicable case law, however, 
the test of enablement is not whether experimentation is necessary to make and use an invention, 
but rather, if experimentation is necessary, whether it is undue. In re Angstadt, 190 USPQ 214,
219 (C.C.P.A. 1976). Furthermore, a considerable amount of experimentation is permissible if it 
is merely routine or if the specification provides a reasonable amount of guidance in which the 
experimentation should proceed. In re Wands, 8 USPQ2d 1400 (Fed. Cir. 1988). 

The test of whether an invention requires undue experimentation is not based on a single 
factor, but rather is a conclusion reached by weighing many factors. Id. at 1404. Factors to be 
considered in determining whether undue experimentation is required include the quantity of 
experimentation necessary, the amount of guidance provided in the specification, the presence of 
working examples of the invention in the application, the nature of the invention, the state of the 
prior art, the relative skill of those in the art, the predictability in the art, and the breadth of the 
claimed invention. Id. Accordingly, the holding of Wands does not require that Applicants 
provide as working examples every pesticidal polypeptide that could be used to practice the 
present invention. Rather, Wands sets out factors to be considered in determining whether undue 
experimentation is required to make and use the invention. 

The Examiner argues that the specification does not enable one of skill in the art to make 
and use nucleic acids that encode polypeptides that retain pesticidal activity and have at least 
95% sequence identity to SEQ ID NO:1 or 3, or 95% sequence identity to a nucleotide sequence
that encodes SEQ ID NO:2 or 4. The Examiner incorrectly bases this conclusion solely on the
number of possible nucleic acids having the recited percent identity to SEQ ID NO:1 or 3, or a
nucleotide sequence encoding SEQ ID NO:2 or 4 while ignoring the other factors set forth in 
Wands for assessing whether undue experimentation is required. In particular, the Examiner has 
improperly discounted the guidance provided in the specification and the working examples set 
forth in the application (page 4 of the Office Action mailed February 14, 2006). 

First, sufficient guidance for making and using the recited sequences is present in the 
specification. The claimed variants and fragments of SEQ ID NO: 1 or 3, or nucleotide 
sequences encoding SEQ ID NO:2 or 4 are limited by a percent identity (i.e., 95% identity) and 
further limited by the functional requirement that they possess pesticidal activity. Guidance for 
preparing variants and fragments of SEQ ID NO: 1 or 3, or nucleotide sequences encoding SEQ 
ID NO:2 or 4 and for determining percent identity is provided in the specification and generally 
known in the art. See page 8, lines 20-29, and pages 9-13. Numerous delta-endotoxins were 
also well known in the art at the time the application was filed. See Crickmore et al (1998)
Microbiol. Mol. Biol. Rev. 62:807-813, which is incorporated by reference on page 2, lines 7-8
and is submitted herewith as Appendix A, and Crickmore et al (2004) Bacillus thuringiensis 
Toxin Nomenclature at lifesci.sussex.ac.uk/Home/Neil_Crickmore/Bt. The necessary molecular 
biology and mutagenesis techniques for preparing the variants and fragments of pesticidal 
sequences of the invention are routine. Moreover, methods for assessing the pesticidal activity 
of a polypeptide are readily available in the art and provided in the specification. See, for 
example, page 8, lines 25-29 and Examples 8 and 10. 

In order to identify the pesticidal sequences encompassed by the present claims, one of 
skill in the art would only need to prepare variants and fragments of the nucleotide sequence of 
SEQ ID NO:1 or 3, or a nucleotide sequence encoding SEQ ID NO:2 or 4, having the specified
characteristics recited in the claims (e.g., at least 95% identity) and then assay these polypeptides
for pesticidal activity. Methods for preparing variants and fragments and testing the
resulting polypeptides for pesticidal activity are routine in the art and described in the
specification. Although some experimentation is required to practice the claimed invention, it is 
now customary in the art to generate a large number of sequences and to test them in a large- 
scale assay for a desired function, and, therefore, such experimentation is not undue, particularly 
in view of the routine nature of the required methods. Contrary to the Examiner's conclusions,
in order to identify variants and fragments of the nucleotide sequence of SEQ ID NO:1 or 3, or a
nucleotide sequence encoding SEQ ID NO:2 or 4 that could be used in the invention, a person
skilled in the art would only need to utilize standard molecular biology and mutagenesis 
techniques and routine screening tests for pesticidal activity. Therefore, given the level of skill 
and knowledge in the art, the availability of standard methods and assays, and the significant 
guidance provided in the specification, Applicants respectfully submit that the amount of 
experimentation required to identify delta-endotoxins and variants and fragments thereof having 
pesticidal activity and the structural features recited in the claims is routine, not undue. 

The Examiner further argues that mutation of sequences, even conservative substitutions,
does not produce predictable results and, therefore, the specification is not enabling with respect
to variants of the nucleotide sequence of SEQ ID NO:1 or 3, or a nucleotide sequence encoding
SEQ ID NO:2 or 4. The Office Action cites Lazar et al (1988) Molecular and Cellular Biology 
8:1247-1252 and Hill et al (1998) Biochem. Biophys. Res. Comm. 244:573-577 in support of the 
general unpredictability of the art with respect to modification of nucleotide sequences. Each 
reference, however, simply teaches that alteration of highly conserved sequences will disrupt 
function. Lazar et al teach that alterations in amino acid residues 47 and 48 in TGF-alpha can 
alter the activity of the polypeptide. Contrary to the Examiner's conclusion, the alteration in the 
polypeptide was specifically designed to occur at amino acid positions that are highly conserved 
in the EGF-like family of polypeptides. Similarly, the modified residues described by Hill et al 
were conserved among bacterial and plant ADP-glucose pyrophosphorylases. As set forth in the 
first line of the abstract, "[t]wo absolutely conserved histidines and a third highly conserved 
histidine are noted in eleven bacterial and plant ADP-glucose pyrophosphorylases" (emphasis 
added). These absolutely and highly conserved histidines were mutagenized and characterized in 
the paper. One of skill in the art would not be surprised that modification of one of these highly 
conserved amino acids would lead to the loss of function described by the authors. Applicants 
further note that the Lazar et al and the Hill et al references are directed to TGF-alpha and 
ADP-glucose pyrophosphorylase, neither of which has any relation to the pesticidal sequences of 
the present invention. Thus, the cited references do not support the Examiner's broad assertion 
of inherent unpredictability of protein function resulting from the mutation of the underlying
nucleotide sequence. In fact, both references support Applicants' arguments that at the time the 
application was filed one of skill in the art could modify polypeptide sequences and test the 
resulting variants for biological activity. 

Furthermore, the specification provides guidance regarding conservative modifications 
that are unlikely to disrupt biological activity. See, for example, pages 11-12. Thus, by 
reference to a standard codon table, one of skill in the art could predict which modifications 
would not affect the biological activity of the encoded polypeptide. Also, the specification 
highlights conserved residues that are not likely to tolerate substitution (see Figures 1 and 2 as 
originally filed) and delineates conserved domains characteristic of delta-endotoxin proteins (see 
page 4, lines 5-11). The replacement figures do not highlight conserved residues; however, one
of skill in the art would understand how to use the alignment provided in the replacement figure 
to identify conserved residues using, for example, the methods described in the instant 
specification (see pages 10-11). 

Moreover, as described above, Applicants have disclosed pesticidal sequences, and 
variants and fragments thereof, and the art was replete with additional delta-endotoxin sequences 
at the time the application was filed. Information relating to conserved regions of delta- 
endotoxins may be obtained from these sequences. A person of skill in the art would appreciate 
that comparison and alignment of known delta-endotoxin sequences may reveal information 
regarding appropriate sites or regions for modifications. By aligning these sequences, one may 
be able to identify conserved residues or regions within these proteins that are unlikely to tolerate 
mutation and still retain pesticidal activity. Methods for aligning sequences, such as by using the 
CLUSTAL algorithm, are described in the specification. See pages 10-11. 

In addition, detailed information about the structure of delta-endotoxins was known in the 
art. See, for example, Li et al (1991) Nature 353:815-821 (describing the crystal structure of the 
Cry3A protein), which is incorporated by reference on page 12 of the specification, and Morse et 
al (2001) Structure 9:409-417, both of which are submitted herewith (Appendices B and C, 
respectively). Delta-endotoxins are extremely well-characterized and related to each other to 
various degrees by similarities in their amino acid sequences and tertiary structures. A combined 
consideration of the published structural analyses of delta-endotoxins and the reported functions
associated with particular structures, motifs, and the like indicates that specific regions of the 
toxin are correlated with particular functions and discrete steps of the mode of action of the 
protein. Thus, a rational scheme for determining the regions of a delta-endotoxin that would 
tolerate modification is provided. Based on the regions of delta-endotoxins that are conserved 
among protein family members, the skilled artisan could choose among possible modifications to 
produce polypeptides within the structural parameters set forth in the claims and then test these 
modified variants to determine if they retain pesticidal activity. In light of the guidance provided 
in the specification and the state of the art with respect to delta-endotoxins, a skilled artisan 
could readily conclude which amino acids are essential for structure and function and could 
envisage similar sequences that are 95% identical to the nucleotide sequence of SEQ ID NO:1 or
3, or a nucleotide sequence encoding SEQ ID NO:2 or 4, and that retain pesticidal activity. As
such, one of skill in the art could identify the pesticidal sequences encompassed by the present 
claims without undue experimentation. 

The Examiner has also cited Guo et al (2004) Proc. Natl. Acad. Sci. USA 101:9205-9210 
for the proposition that increasing the number of amino acid substitutions in a protein increases 
the probability that the protein will be functionally inactivated. The Examiner uses this reference 
as evidence that making and analyzing delta-endotoxins that have multiple amino acid 
substitutions but that still retain pesticidal activity will require undue experimentation. The 
Examiner, however, has mischaracterized the Guo et al. reference. The cited reference is 
directed to analysis of the probability that a random amino acid replacement will lead to a 
protein's functional inactivation (emphasis added). In contrast, the specification provides a 
rational and systematic method for designing delta-endotoxin variants that retain pesticidal 
activity. One of skill in the art would appreciate that regions known to be important for 
pesticidal activity would be unlikely to tolerate significant mutation and, therefore, would not 
expect such mutations to result in a biologically active protein. Thus, the teachings of Guo et al 
do not support the Examiner's conclusion that the present claims lack enablement. 

The Examiner further relies on the teachings of de Maagd et al (1999) Appl. Environ. 
Microbiol. 65:4369-4374, Tounsi et al (2003) J. Appl. Microbiol. 95:23-28 and
Angsuthanasombat et al (2001) J. Biochem. Mol. Biol. 34:402-407 in support of the assertion
that amino acid substitutions in delta-endotoxin proteins are unpredictable. However, each of 
these references describes substitutions (which are largely non-conservative) in conserved 
regions. De Maagd et al teach that the replacement of several groups of amino acids within
Domain III of Cry1E with the corresponding amino acids of Cry1C will alter the specificity
and/or toxicity of Cry1E. Since the conserved Domain III is well known by those of skill in the
art to be involved in specificity of a delta-endotoxin toward a pest, it would be no surprise that
alteration of this domain could affect specificity of the protein. In fact, that was the intention of
de Maagd et al. Similarly, Tounsi et al discuss the single amino acid difference between
Cry1Ia1 and Cry1Ia2 (which is a non-conservative substitution of aspartic acid for tyrosine at
position 233) as being critical to insecticidal specificity of these two toxins. Again, this 
substitution occurs in the conserved Domain I. Finally, Angsuthanasombat et al teach a critical 
amino acid residue at position 136 where even a conservative substitution could lead to loss of 
pesticidal activity. Yet again, the authors specifically targeted amino acids in conserved Domain 
I in order to alter function. Since the instant specification clearly defines the conserved domains 
described by the aforementioned references with respect to the claimed sequences (see page 4), 
one of skill in the art would appreciate that substitutions made in these domains could lead to a 
loss of specificity and/or toxicity. Further, the references cited by the Examiner actually support 
the Applicant's assertion that one of skill in the art at the time of the invention would understand 
which residues could be altered to change the function of delta-endotoxins, implying that one of 
skill would equally understand which residues not to change when maintenance of function is 
desired. 

In establishing non-enablement, the burden rests initially with the Examiner to 
substantiate the unpredictability of the art and that, given the unpredictability, the specification 
does not provide sufficient information to guide those of skill to make and use the claimed 
invention across the full scope of the claims. In view of the discussion above, the references 
cited by the Examiner fail to support the position that claims 1-11, 19, and 22-23 are not
enabled. 

The Examiner also states that the specification fails to teach how to use a complement of
nucleic acids encoding pesticidal protein with 95% identity to SEQ ID NO:2 or 4 or nucleic acids
with 95% identity to SEQ ID NO:1 or 3. The Applicant respectfully disagrees. Page 8, lines 2-7
state that the complement of a claimed nucleotide sequence is one that would hybridize to a 
given nucleotide sequence to thereby form a stable duplex. Page 13, lines 24-25 state that 
hybridization methods (using, for example, complementary sequences) can be used to screen 
cDNA or genomic libraries for delta-endotoxin sequences having substantial identity to the 
sequences of the invention. Therefore, the specification clearly teaches how to use the 
complement of a nucleic acid sequence of the invention to, for example, screen for similar delta- 
endotoxin sequences. However, to expedite prosecution, claim 1 has been amended to delete the 
subject matter pertaining to complementary sequences. 

The Examiner further maintains that the specification does not enable the transformation 
of any plant with a nucleotide sequence with 95% identity to the nucleotide sequence of SEQ ID 
NO:1 or 3, or a nucleotide sequence encoding SEQ ID NO:2 or 4 because undue trial and error
experimentation would be required to screen for nucleotide sequences encompassed by the
claims and plants transformed therewith to identify those plants with pesticidal activity. As
discussed above, the amount of experimentation required to identify a nucleotide sequence that
has 95% sequence identity to SEQ ID NO:1 or 3, or to a nucleotide sequence encoding SEQ ID
NO:2 or 4 is not undue. With respect to transformation of plants with these sequences, the 
specification provides routine methods for transformation of plants with nucleotide sequences 
and the regeneration of transgenic plants. See pages 20-27 and Examples 12 and 13. Given the 
guidance provided in the specification and the knowledge in the art, the claims directed to 
transformation of a plant with a delta-endotoxin sequence, or variant or fragment thereof, are 
fully enabled. 

In light of the above arguments, the level of skill and knowledge in the art, and the 
guidance provided in the specification, Applicants respectfully submit that the specification is 
enabling for the full scope of claims 1-11, 19, and 22-23. Thus, the rejection of the claims under
35 U.S.C. § 112, first paragraph, for lack of enablement should be withdrawn.

Written Description

Claims 1-11, 19, and 22-23 were further rejected under 35 U.S.C. § 112, first paragraph,
as failing to satisfy the written description requirement. The rejection is respectfully traversed. 

The Examiner asserts that the disclosure is insufficient to support claims that are drawn to 
a genus of nucleic acids having 95% sequence identity to SEQ ID NO:1 or 3, or nucleic acids
encoding polypeptides having 95% identity to SEQ ID NO:2 or 4. 

In order to satisfy the written description requirement of 35 U.S.C. § 112, the application
must reasonably convey to one skilled in the art that the applicant was in possession of the
claimed subject matter at the time the application was filed. Vas-Cath Inc. v. Mahurkar, 935
F.2d 1555, 1563, 19 U.S.P.Q.2d (BNA) 1111, 1117 (Fed. Cir. 1991). Every species
encompassed by the claimed invention, however, need not be disclosed in the specification to
satisfy the written description requirement of 35 U.S.C. § 112, first paragraph. Utter v. Hiraga,
845 F.2d 993, 6 USPQ2d 1709 (Fed. Cir. 1988). The Federal Circuit has made it clear that
sufficient written description requires simply the knowledge and level of skill in the art to permit
one of skill to immediately envision the product claimed from the disclosure. Purdue Pharma
L.P. v. Faulding Inc., 230 F.3d 1320, 1323, 56 USPQ2d 1481, 1483 (Fed. Cir. 2000) ("One
skilled in the art must immediately discern the limitations at issue in the claims."). 

Moreover, the "Guidelines for Examination of Patent Applications Under 35 U.S.C.
§ 112, ¶ 1, 'Written Description' Requirement" state that a genus may be described by "sufficient
description of a representative number of species ... or by disclosure of relevant, identifying
characteristics, i.e. structure or other physical and/or chemical properties." Id. at 1106. This is
in accordance with the standard for written description set forth in Regents of the University of 
California v. Eli Lilly & Co., 119 F.3d 1559 (Fed. Cir. 1997), where the court held that "[a]
written description of an invention involving a chemical genus, like a description of a chemical 
species, 'requires a precise definition, such as by structure, formula, or chemical name' of the 
claimed subject matter sufficient to distinguish it from other materials." 119 F.3d at 1568, citing 
Fiers v. Revel, 984 F.2d 1164 (Fed. Cir. 1993). In Enzo Biochem, Inc. v. Gen-Probe, Inc., 323
F.3d 956 (Fed. Cir. 2002), the Federal Circuit adopted the PTO standard for written description,
stating: 

"[U]nder the Guidelines, the written description requirement would be met ... if
the functional characteristics of [a genus of polypeptides] were coupled with a 
disclosed correlation between that function and a structure that is sufficiently 
known or disclosed. We are persuaded by the Guidelines on this point and adopt 
the PTO's applicable standard for determining compliance with the written 
description requirement." 

The claims of the present application meet the requirements for written description set 
forth by the Federal Circuit. The claims recite that the nucleic acid has 95% sequence identity
to the nucleotide sequence of SEQ ID NO:1 or 3, or to a nucleotide sequence encoding SEQ ID
NO:2 or 4. Methods for determining percent identity between any two sequences are known in
the art and are provided in the specification. See pages 9-11. As discussed above, nucleotide
sequences for full-length AXMI-007 (SEQ ID NO:1), as well as variants and fragments (e.g.,
SEQ ID NO:3) are disclosed in the specification. Numerous delta-endotoxin sequences were 
also generally known in the art at the time the application was filed. Moreover, detailed 
information regarding the structure of delta-endotoxins and the reported functions associated 
with particular structures, regions, and motifs was also available in the prior art as well as 
discussed in detail on page 2, lines 21-28, Figure legend 1, and on page 12. At the time of filing, 
it was known that delta-endotoxins generally comprise three domains, a seven-helix bundle that 
is involved in pore formation, a three-sheet domain that has been implicated in pore formation, 
and a beta-sandwich motif. See Li et al (1991) Nature 353:815-821. Thus, the recitation of
polypeptides having a particular percent identity to a delta-endotoxin provides very specific and 
defined structural parameters of the sequences that can be used in the invention. These structural 
limitations are sufficient to distinguish the nucleotide and amino acid sequences of the invention 
from other nucleic acids and polypeptides and thus sufficiently define the genus of sequences 
useful in the practice of the present invention. 

The Examiner is reminded that the description of a representative number of species does 
not require the description to be of such specificity that it would provide individual support for 
each species that the genus embraces. 66 Fed. Reg. 1099, 1106 (2000). Satisfactory disclosure
of a "representative number" depends on whether one of skill in the art would recognize that the 
applicant was in possession of the necessary common attributes or features of the elements 
possessed by the members of the genus in view of the species disclosed. 66 Fed. Reg. 1099,
1106 (2000). Here, Applicants have provided nucleotide and amino acid sequences for
exemplary pesticidal sequences and variants and fragments thereof encompassed by the claims. 
Moreover, numerous delta-endotoxin sequences were known and readily available in the art. 
Therefore, Applicants submit that in view of the present disclosure and the knowledge and level 
of skill in the art the skilled artisan would envision the claimed invention. 

The description of a claimed genus can be by structure, formula, chemical name, or 
physical properties. See Ex parte Maizel, 27 USPQ2d 1662, 1669 (B.P.A.I. 1992), citing Amgen 
v. Chugai, 927 F.2d 1200, 1206 (Fed. Cir. 1991). A genus of polypeptides may therefore be
described by means of a recitation of a representative number of amino acid sequences that fall
within the scope of the genus, or by means of a recitation of structural features common to the
genus, which features constitute a substantial portion of the genus. See Regents of the University
of California v. Eli Lilly & Co., 119 F.3d 1559, 1569 (Fed. Cir. 1997); see also Guidelines for
Examination of Patent Applications Under the 35 U.S.C. 112, first paragraph, "Written
Description" Requirement, 66 Fed. Reg. 1099, 1106 (2000). The recitation of a predictable
structure (i.e., an amino acid sequence having a specified percent identity or number of 
contiguous amino acid residues of a particular sequence) is sufficient to satisfy the written 
description requirement. Thus, the application provides the structural features that characterize 
sequences having at least 95% sequence identity to SEQ ID NO: 1 or 3, or to a nucleotide 
sequence encoding SEQ ID NO:2 or 4 that retain pesticidal activity. 

An Applicant may also rely upon functional characteristics in the description, provided 
there is a correlation between the function and structure of the sequences recited in the claims. 
Id., citing Lilly at 1568. The present claims further recite functional characteristics that 
distinguish the sequences of the claimed genus. Specifically, the claims recite that the sequences 
having at least 95% sequence identity to SEQ ID NO:1 or 3, or to a nucleotide sequence
encoding SEQ ID NO:2 or 4 encode proteins which have pesticidal activity. The specification 
and the art provide standard assays that may be used to measure pesticidal activity. See, for 
example, page 11, lines 20-24. Furthermore, as noted above, Applicants have disclosed fragment 
sequences that retain pesticidal activity (e.g., SEQ ID NO:3, which encodes a fragment of SEQ 
ID NO:2). Accordingly, both the structural and functional properties that characterize the genus
of sequences that can be used to practice the invention are specifically recited in the claims. The 
sequences that fall within the scope of the claims can readily be identified by the methods set 
forth in the specification. 

In summary, the specification provides an adequate written description of the claimed 
invention. In particular, the specification provides: nucleotide and amino acid sequences for 
pesticidal toxins, and variants and fragments thereof, that fall within the scope of the claims; 
guidance regarding sequence alterations that do not disrupt pesticidal activity of a toxin; 
guidance for determining percent identity; and methods for assaying the pesticidal activity of 
proteins. In view of the above remarks and claim amendments, Applicants submit that the 
relevant identifying structural and functional properties of the genus of sequences of the present 
invention would be clearly recognized by one of skill in the art. Consequently, Applicants were 
in possession of the invention at the time the application was filed, and the rejection of the claims 
under 35 U.S.C. § 112, first paragraph, for lack of written description should be withdrawn.

The Rejection of the Claims Under 35 U.S.C. § 112, Second Paragraph Should Be Withdrawn

Claims 3, 11, and 19, as well as dependent claims therefrom, were rejected under 35
U.S.C. § 112, second paragraph as being indefinite for failing to particularly point out and
distinctly claim the subject matter that Applicant regards as the invention. 

Claim 3 has been amended to recite "relative to the GC content of SEQ ID NO: 1 or 3." 
Support for this amendment can be found on page 25, lines 2-3. Claim 11 has been amended to
recite "the plant" of claim 9. Claim 9 depends from claim 8, which depends from claim
6, which depends from claim 4, which depends from claim 1. Therefore, as amended, claim 11
now describes a transgenic seed derived from a plant that comprises a host cell that contains a
vector comprising the nucleic acid of claim 1. Claim 19 has been amended to recite "the nucleic
acid molecule" such that claim 19 now encompasses a method for producing a polypeptide by
culturing a host cell that contains a vector that comprises the nucleic acid of claim 1.
Accordingly, the rejection of claims 3, 11, and 19 under 35 U.S.C. § 112, second paragraph
should be withdrawn. 

The Rejection of the Claims Under 35 U.S.C. § 102 Should Be Withdrawn 

Claims 22 and 23 were rejected under 35 U.S.C. § 102(b) as being anticipated by Barton 
et al. (U.S. Patent No. 6,833,449). Barton et al. teach tobacco plants transformed with a nucleic 
acid encoding a Cry1 protein. The Examiner states that the recitation of "a" before "nucleotide 
sequence of SEQ ID NO:1 or 3" in parts (a) and (d) and "an" before "amino acid sequence of 
SEQ ID NO:2 or 4" in part (c) encompasses nucleic acids that comprise the full-length sequence 
of SEQ ID NO:1 or 3, or any portion of SEQ ID NO:1 or 3, or that encode the full length of SEQ 
ID NO:2 or 4 or any portion of SEQ ID NO:2 or 4. Claims 22 and 23 have been amended to 
recite "the" before "nucleotide sequence" and "amino acid sequence." As such, the Cry1 protein 
taught in Barton et al. does not comprise the sequence of SEQ ID NO:1 or 3, nor a sequence with 
95% identity to SEQ ID NO:1 or 3. Accordingly, the rejection of claims 22 and 23 under 35 
U.S.C. § 102(b) should be withdrawn. 

It is not believed that extensions of time or fees for net addition of claims are required, 
beyond those that may otherwise be provided for in documents accompanying this paper. 
However, in the event that additional extensions of time are necessary to allow consideration of 
this paper, such extensions are hereby petitioned under 37 CFR § 1.136(a), and any fee required 
therefor (including fees for net addition of claims) is hereby authorized to be charged to Deposit 
Account No. 16-0605. 



Respectfully submitted, 



W. Murray Spruill 
Registration No. 32,943 

BACKGROUND AND HISTORY OF PESTICIDAL 
CRYSTAL PROTEIN NOMENCLATURE 

Since the first cloning of an insecticidal crystal protein gene 
from Bacillus thuringiensis (91), many other such genes have 
been isolated. Initially, each newly characterized gene or protein 
received an arbitrary designation from its discoverers: icp 
(64); cry (21, 121); kurhd1 (31); Bta (88); bt1, bt2, etc. (40); 
type B and type C (43); and 4.5 kb, 5.3 kb, and 6.6 kb (55). The 
first systematic attempt to organize the genetic nomenclature 
relied on the insecticidal activities of crystal proteins for the 
primary ranking of their corresponding genes (44). The cryI 
genes encoded proteins toxic to lepidopterans; cryII genes encoded 
proteins toxic to both lepidopterans and dipterans; cryIII 
genes encoded proteins toxic to coleopterans; and cryIV genes 
encoded proteins toxic to dipterans alone. 

This system provided a useful framework for classifying the 
ever-expanding set of known genes. Inconsistencies existed in 
the original scheme, however, due to attempts to accommodate 
genes that were highly homologous to known genes but 
did not encode a toxin with a similar insecticidal spectrum. The 
cryIIB gene, for example, received a place in the lepidopteran-dipteran 
class with cryIIA, even though toxicity against dipterans 
could not be demonstrated for the toxin designated 
CryIIB. Other anomalies arose after the nomenclature was 
established. The protein named CryIC, for example, was reported 
to be toxic to both dipterans and lepidopterans (103), 
while the protein designated CryIB was reported to be toxic to 
both lepidopterans and coleopterans (8). Because the nomenclature 
system provided no central committee or database to 
maintain standardization, new genes encoding a diverse set of 
proteins without a common insecticidal activity each received 
the name cryV, based on the next available Roman numeral 
(32, 46, 67, 100, 102, 108). 

PROPOSED NOMENCLATURE 

We propose in this review a revised nomenclature for the cry 
and cyt genes. To organize the wealth of data produced by 
genomic sequencing efforts, a new nomenclatural paradigm is 
emerging, exemplified by the internationally recognized 
cytochrome P-450 superfamily nomenclature system (68a, 122a). 
Our proposal conforms closely to this model both in concep- 
tual basis and in nomenclature format. The underlying basis of 
this type of system is to assign names to members of gene 
superfamilies according to their degree of evolutionary diver- 
gence as estimated by phylogenetic tree algorithms. The no- 
menclature format in such a system is designed to convey rich 
informational content about these relationships by appending 
to the mnemonic root a series of numerals and letters assigned 
in a hierarchical fashion to indicate degrees of phylogenetic 
divergence. This change from a function-based to a sequence- 
based nomenclature allows closely related toxins to be ranked 
together and removes the necessity for researchers to bioassay 
each new protein against a growing series of organisms before 
assigning it a name. 

In our proposed revision, Roman numerals have been exchanged 
for Arabic numerals in the primary rank (e.g., 
Cry1Aa) to better accommodate the large number of expected 
new proteins. The mnemonic Cyt to designate crystal proteins 
showing a general cytolytic activity in vitro has been retained 
because of its historical precedent and entrenchment in the 
research literature. Our definition of a Cry protein is rather 
broad: a parasporal inclusion (crystal) protein from B. thurin- 
giensis that exhibits some experimentally verifiable toxic effect 
to a target organism, or any protein that has obvious sequence 
similarity to a known Cry protein. Similarly, Cyt denotes a 
parasporal inclusion (crystal) protein from B. thuringiensis that 
exhibits hemolytic activity, or any protein that has obvious 
sequence similarity to a known Cyt protein. By these criteria, 
the nontoxic 40-kDa crystal protein from B. thuringiensis subsp. 
thompsoni, for example, has been excluded from our list, but 
the lepidopteran-active 34-kDa protein (now Cry15A) encoded 
by an adjacent gene has been included (11). 

The freely available software applications CLUSTAL W 
(110) and PHYLIP (27) define the sequence relationships 
among the toxins to form the framework of the new nomen- 
clature. In the first step, CLUSTAL W aligns the deduced 
amino acid sequences of the full-length toxins and produces a 
distance matrix, quantitating the sequence similarities among 
the set of toxins. CLUSTAL W default settings are employed, 
except that the "delay divergent sequences" setting in the mul- 
tiple-alignment parameter menu is reduced from 40 to 0%. 
The NEIGHBOR application within the PHYLIP package 
then constructs a phylogenetic tree from the distance matrix by 
an unweighted pair-group method using arithmetic averages 
(UPGMA) algorithm. The TREEVIEW application (73), with 
the "phylogenetic tree" and "ladderize left" options selected, 
produces a graphic presentation of the resulting tree. 
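
The clustering step described above can be illustrated with a small sketch. The Python function below is a minimal UPGMA implementation written for illustration only. It is not the NEIGHBOR program; the function name `upgma`, the dictionary-based input format, and the toy distances in the usage note are assumptions, not details taken from the paper.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster `names` by UPGMA.

    `dist` maps pairs (a, b) of names to their pairwise distance.
    Returns a nested tuple (left, right, height) for the rooted tree.
    """
    # Each active cluster label maps to (subtree, number of leaves).
    clusters = {name: (name, 1) for name in names}
    # Order-independent lookup of the current inter-cluster distances.
    d = {frozenset(pair): value for pair, value in dist.items()}

    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2.0
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merged = (a, b)  # hypothetical label for the new cluster
        # Size-weighted average, so every leaf pair counts equally.
        for c in clusters:
            d[frozenset((merged, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[merged] = ((tree_a, tree_b, height), n_a + n_b)

    (tree, _), = clusters.values()
    return tree
```

For three hypothetical sequences with distances A-B = 10, A-C = 50, and B-C = 50, the sketch joins A and B at height 5 and then attaches C at height 25. Because UPGMA places every leaf at the same distance from the root, the resulting tree lends itself to the vertical rank boundaries discussed in the text.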

We have applied this procedure to the set of holotype se- 
quences given in Table 1 to produce the phylogenetic tree 
presented in Fig. 1. Vertical lines drawn through the tree show 
the boundaries used to define the various nomenclatural ranks. 
The name given to any particular toxin depends on the location 
of the node where the toxin enters the tree relative to these 
boundaries. A new toxin that joins the tree to the left of the 
leftmost boundary will be assigned a new primary rank (an 
Arabic number). A toxin that enters the tree between the left 
and central boundaries will be assigned a new secondary rank 
(an uppercase letter). It will have the same primary rank as the 
other toxins within that cluster. A toxin that enters the tree 
between the central and right boundaries will be assigned a 
new tertiary rank (a lowercase letter). Finally, a toxin that joins 
the tree to the right of the rightmost boundary will be assigned 
a new quaternary rank (another Arabic number). Toxins with 
identical sequences but isolated independently will receive sep- 
arate quaternary ranks. 

By this method each toxin will be assigned a unique name 
incorporating all four ranks. A completely novel toxin would 
currently be assigned the name Cry23Aa1. For the sake of 
convenience, however, we propose that the inclusion of the 
tertiary rank a and quaternary rank 1 be optional, their use 
dictated only by a need for clarity. This new toxin could there- 
fore simply be referred to as Cry23A. 
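
The four-rank name format can be made concrete with a short parsing sketch. The Python code below is illustrative only: the regular expression, the function names `parse_name` and `short_name`, and the rule that omitted ranks default to tertiary rank a and quaternary rank 1 are one reading of the convention described above, not an official tool of the nomenclature committee.

```python
import re

# Root (Cry or Cyt), primary rank (Arabic number), secondary rank
# (uppercase letter), then an optional tertiary rank (lowercase letter)
# followed by an optional quaternary rank (Arabic number).
_NAME = re.compile(r"^(Cry|Cyt)(\d+)([A-Z])(?:([a-z])(\d+)?)?$")


def parse_name(name):
    """Split a toxin name into (root, primary, secondary, tertiary, quaternary).

    Assumption: omitted ranks default to tertiary "a" and quaternary 1.
    """
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognizable toxin name: {name!r}")
    root, primary, secondary, tertiary, quaternary = match.groups()
    return root, int(primary), secondary, tertiary or "a", int(quaternary or 1)


def short_name(name):
    """Drop quaternary rank 1 and, after that, tertiary rank a."""
    root, primary, secondary, tertiary, quaternary = parse_name(name)
    base = f"{root}{primary}{secondary}"
    if quaternary != 1:
        return f"{base}{tertiary}{quaternary}"
    if tertiary != "a":
        return f"{base}{tertiary}"
    return base
```

Under these assumptions, `short_name("Cry23Aa1")` gives `"Cry23A"`, while a name such as `"Cry1Aa2"` is returned unchanged because its quaternary rank carries information.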

In choosing locations for rank boundaries, we attempted to 
construct a nomenclature reflecting significant evolutionary 
relationships while at the same time minimizing changes from 
the gene names assigned under the old system. In the resulting 
system, proteins with a common primary rank are similar 
enough that the percent identity can be defined with some 
confidence. Proteins with the same primary rank often affect 
the same order of insect; those with different secondary and 
tertiary ranks may have altered potency and targeting within an 
order. At the tertiary rank, differences can be due to the ac- 
cumulation of dispersed point mutations, but often they appear 
to have resulted from ancestral recombination events between 
genes differing at a lower rank level (9). The quaternary rank 
was established to group "alleles" of genes coding for known 
toxins that differ only slightly, either because of a few muta- 
tional changes or an imprecision in sequencing. To avoid con- 
fusion, however, the reader should bear in mind the differences 
between the quaternary rank number and the classical concept 
of the allele. Any cry gene specified with a quaternary rank is 
a natural isolate. No assumption about functionality is implied 
by the presence of this rank number in the gene name. In 
contrast, an allele number would be assumed, unless paren- 
thetical or subscripted information indicated otherwise, to de- 
note a nonfunctional mutant form of a wild-type gene found at 
a discrete genetic locus. Because of the somewhat modular 
nature of the Cry proteins and the effect that various segmental 
relationships could have on the clustering algorithm, it is likely 
that these boundaries will move slightly or even bend as the 
addition of new sequences changes the topology of the phylo- 
genetic tree. Currently the boundaries represent approxi- 
mately 95, 78, and 45% sequence identity. 
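
As a rough illustration of how the boundaries translate into ranks, the sketch below maps the percent identity at which a new toxin joins the tree to the rank it would open. It is a simplification under stated assumptions: the thresholds are the approximate values quoted above, the function name `new_rank_level` is hypothetical, and the actual assignment depends on the node position in the tree rather than on a single pairwise identity value.

```python
# Approximate boundary identities quoted in the text (percent).
# The text notes that these boundaries may shift as sequences are added.
PRIMARY_BOUNDARY = 45.0
SECONDARY_BOUNDARY = 78.0
TERTIARY_BOUNDARY = 95.0


def new_rank_level(join_identity):
    """Return the rank a new toxin would open, given the percent identity
    at which it joins the existing tree (assumed input, 0 to 100)."""
    if not 0.0 <= join_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if join_identity < PRIMARY_BOUNDARY:
        return "primary"      # new Arabic number
    if join_identity < SECONDARY_BOUNDARY:
        return "secondary"    # new uppercase letter
    if join_identity < TERTIARY_BOUNDARY:
        return "tertiary"     # new lowercase letter
    return "quaternary"       # new trailing Arabic number
```

How a value falling exactly on a boundary is treated is an arbitrary choice in this sketch; the text does not specify it.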

A B. thuringiensis Pesticidal Crystal Protein Nomenclature 
Committee, consisting of the authors of this paper, will remain 
as a standing committee of the Bacillus Genetic Stock Center 
(BGSC) to assist workers in the field of B. thuringiensis genetics 
in assigning names to new Cry and Cyt toxins. The corresponding 
gene or protein sequences must first be deposited into a 
publicly accessible database (GenBank, EMBL, or PIR) and 
released by the repository for electronic publication in the 
database so that the scientific community may conduct an 
independent analysis. Researchers should submit new sequences 
directly to the BGSC director (D. R. Zeigler), either 
by electronic mail (zeigler.1@osu.edu) or on computer diskette. 
The director will analyze the amino acid sequence as 
described above and suggest the appropriate name, subject to 
the approval of the committee. The committee will periodically 
review the literature of the Cry and Cyt toxins and publish a 
comprehensive list. This list, alongside other relevant information, 
will also be available via the Internet at the following 
URL: http://www.biols.susx.ac.uk/Home/Neil_Crickmore/Bt/. 

The current list of cry and cyt genes (including quaternary 
ranks) is given in Table 1. New gene names are listed with their 
previous names, their GenBank accession numbers, and pub- 
lished references. The quaternary ranks were assigned in the 
order that the gene sequences were discovered in the literature 
or submitted to the committee. Genes assigned the quaternary 
rank 1 represent holotype sequences. 

The boundaries shown in Fig. 1 allow most cry genes to 
retain the names they received under the system of Höfte and 
Whiteley (44), after a substitution of Arabic for Roman numerals. 
There are a few notable exceptions: cryIG becomes 
cry9A, cryIIIC becomes cry7Aa, cryIIID becomes cry3C, cryIVC 
becomes cry10A, cryIVD becomes cry11A, cytA becomes cyt1A, 
and cytB becomes cyt2A (Table 1). Under the revised system, 
the known Cry and Cyt proteins fall into 24 sets at the primary 
rank: Cyt1, Cyt2, and Cry1 through Cry22. 

ROBUSTNESS OF THE NOMENCLATURE 

The robustness of the current naming process was assessed 
by a number of additional analyses. The choice of clustering 
algorithm (unweighted pair-group method using arithmetic av- 
erages) was driven largely by the consistent location of a root 
and constant branch lengths, resulting in a common vertical 
alignment of sequence names and essentially allowing a "ruler 
across the tree" approach to naming. It has the drawback of 
imposing a common evolutionary clock on the clustering pro- 
cess, an assumption that cannot be assured. The distance met- 
ric related to percent identity (essentially 1 minus the fraction 
of identical residues of the total compared without gaps) is the 
one most commonly found as the output of sequence compar- 
ison programs, including CLUSTAL W. For phylogenetic anal- 
ysis, a more usual distance metric relates to the number of 
substitutions per site to convert one sequence to the other 
(e.g., Dayhoff's point accepted mutation [PAM]) and accounts 
for the possibility of multiple substitutions per site as the se- 
quences are more divergent. The latter method has the draw- 
back of being more computationally intensive, and, for very 
divergent sequences, requiring too large a value, resulting in 
numeric computation failures. They also differ in the way se- 
quences of unequal length are handled, with the percent iden- 
tity method typically ignoring excess sequence and the other 
methods assigning a penalty. This is particularly important for 
crystal proteins, since a number of them lack the C-terminal 
protoxin segments yet are quite related to some longer toxins 
in the N-terminal toxin segment; we feel that the stronger 
association of such relationships found by the percent identity 
method is preferred. 
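
The percent identity distance described in this paragraph can be written out as a short sketch. The Python function below computes 1 minus the fraction of identical residues among aligned positions where neither sequence has a gap. The function name `identity_distance`, the use of "-" as the gap symbol, and the toy sequences in the usage note are assumptions made for illustration; the sketch does not reproduce CLUSTAL W's own implementation.

```python
def identity_distance(seq_a, seq_b, gap="-"):
    """Return 1 - (identical residues / positions compared without gaps).

    Both inputs must be rows of the same alignment (equal length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(seq_a, seq_b):
        # Positions with a gap in either sequence are ignored, so excess
        # sequence in the longer protein carries no penalty.
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 1.0 - identical / compared
```

For the made-up aligned fragments `"MKT-A"` and `"MRTLA"`, four positions are compared and three are identical, giving a distance of 0.25. Skipping gapped positions is what lets a toxin lacking the C-terminal protoxin segment still score as closely related to a longer toxin over the shared N-terminal region.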

To assess the effect of using the neighbor-joining method to 
generate an unrooted tree, CLUSTAL W routines were used 
to generate such a tree with 1,000 bootstraps of the sequence 
alignment we used for Fig. 1. When an appropriate outgroup 
was chosen, the resulting tree (not shown) resembled our Fig. 
1. The bootstrap values indicated that the tree thus generated 

TABLE 1. Known cry and cyt gene sequences with revised nomenclature assignments 

Revised      Original gene or  Accession   Coding       Reference 
gene name    protein name      no.         region^a 

cry1Aa1      cryIA(a)          M11250      527-4054     92 
cry1Aa2      cryIA(a)          M10917      153->2955    98 
cry1Aa3      cryIA(a)          D00348      73-3600      99 
cry1Aa4      cryIA(a)          X13535      1-3528       62 
cry1Aa5      cryIA(a)          D17518      81-3608      113 
cry1Aa6      cryIA(a)          U43605      1->1860      63 
cry1Ab1      cryIA(b)          M13898      142-3606     119 
cry1Ab2                        M12661      155-3622     111 
cry1Ab3      cryIA(b)          M15271      156-3620     31 
cry1Ab4      cryIA(b)          D00117      163-3627     50 
cry1Ab5      cryIA(b)          X04698      141-3605     40 
cry1Ab6      cryIA(b)          M37263      73-3537      37 
cry1Ab7                        X13233      1-3465       36 
cry1Ab8      cryIA(b)          M16463      157-3621     69 
cry1Ab9      cryIA(b)          X54939      73-3537      13 
cry1Ab10     cryIA(b)          A29125                   28 
cry1Ac1      cryIA(c)          M11068      388-3921     3 
cry1Ac2      cryIA(c)          M35524      239-3769     117 
cry1Ac3      cryIA(c)          X54159      339->2192    18 
cry1Ac4      cryIA(c)          M73249      1-3534       84 
cry1Ac5                        M73248      1-3531       83 
cry1Ac6      cryIA(c)          U43606      1->1821      63 
cry1Ac7      cryIA(c)          U87793      976-4509     38 
cry1Ac8      cryIA(c)          U87397      153-3686     71 
cry1Ac9      cryIA(c)          U89872      388-3921     33 
cry1Ac10                       AJ002514    388-3921     107 
cry1Ad1      cryIA(d)          M73250      1-3537       79 
cry1Ae1                        M65252      81-3623      60 
cry1Af1      icp               U82003      172->2905    49 
cry1Ba1      cryIB             X06711      1-3684       10 
cry1Ba2                        X95704      186-3869     105 
cry1Bb1      ET5               L32020      67-3753      25 
cry1Bc1      cryIB(c)          Z46442      141-3839     6 
cry1Bd1      cryE1             U70726                   12 
cry1Ca1      cryIC             X07518      47-3613      45 
cry1Ca2      cryIC             X13620      241->2711    88 
cry1Ca3      cryIC             M73251      1-3570       79 
cry1Ca4      cryIC             A27642      234-3800     114 
cry1Ca5      cryIC             X96682      1->2268      106 
cry1Ca6      cryIC             X96683                   106 
cry1Ca7      cryIC             X96684      1->2268      106 
cry1Cb1      cryIC(b)          M97880      296-3823     48 
cry1Da1      cryID             X54160      264-3758     42 
cry1Db1      prtB              Z22511      241-3720     56 
cry1Ea1      cryIE             X53985      130-3642     115 
cry1Ea2      cryIE             X56144      1-3513       7 
cry1Ea3      cryIE             M73252      1-3513       82 
cry1Ea4                        U94323      388-3900     47 
cry1Eb1      cryIE(b)          M73253      1-3522       81 
cry1Fa1      cryIF             M63897      478-3999     14 
cry1Fa2      cryIF             M73254      1-3525       80 
cry1Fb1      prtD              Z22512      483-4004     56 
cry1Ga1      prtA              Z22510      67-3564      56 
cry1Ga2      cryIM             Y09326      692-4210     96 
cry1Gb1      cryH2             U70725                   12 
cry1Ha1      prtC              Z22513      530-4045     56 
cry1Hb1                        U35780      728-4195     53 
cry1Ia1      cryV              X62821      355-2511     108 
cry1Ia2      cryV              M98544      1-2157       34 
cry1Ia3      cryV              L36338      279-2435     100 
cry1Ia4      cryV              [illegible] 61-2217      [illegible] 
cry1Ia5      cryV159           Y08920      524-2680     94 
cry1Ib1      cryV465           U07642      237-2393     100 
cry1Ja1      ET4               L32019      99-3519      25 
cry1Jb1      ET1               U31527      177-3686     116 
cry1Ka1                        U28801      451-4098     52 
cry2Aa1      cryIIA            M31738      156-2054     20 
cry2Aa2                        M23723      1840-3738    123 
cry2Aa3                        D86064      2007-3911    89 
cry2Ab1      cryIIB            M23724      1-1899       123 
cry2Ab2      cryIIB            X55416      874-2775     17 
cry2Ac1      cryIIC            X57252      2125-3990    124 
cry3Aa1      cryIIIA           M22472      25-1956      39 
cry3Aa2      cryIIIA           J02978      241-2172     93 
cry3Aa3      cryIIIA           Y00420      566-2497     41 
cry3Aa4      cryIIIA           M30503      201-2132     65 
cry3Aa5      cryIIIA           M37207      569-2500     22 
cry3Aa6      cryIIIA           U10985      569-2500     1 
cry3Ba1      cryIIIB2          X17123      25->1977     101 
cry3Ba2      cryIIIB           A07234      342-2297     85 
cry3Bb1      cryIIIBb          M89794      202-2157     24 
cry3Bb2      cryIIIC(b)        U31633      144-2099     23 
cry3Ca1      cryIIID           X59797      232-2178     59 
cry4Aa1      cryIVA            Y00423      1-3540       121 
cry4Aa2      cryIVA            D00248      393-3935     95 
cry4Ba1      cryIVB            X07423      157-3564     16 
cry4Ba2      cryIVB            X07082      151-3558     112 
cry4Ba3      cryIVB            M20242      526-3930     125 
cry4Ba4      cryIVB            D00247      461-3865     95 
cry5Aa1      cryVA(a)          L07025      1->4155      102 
cry5Ab1      cryVA(b)          L07026      1->3867      67 
cry5Ac1                        I34543      1->3660      76 
cry5Ba1      PS86Q3            U19725      1->3735      76 
cry6Aa1      cryVIA            L07022      1->1425      68 
cry6Ba1      cryVIB            L07024      1->1185      67 
cry7Aa1      cryIIIC           M64478      184-3597     58 
cry7Ab1      cryIIIC(b)        U04367      1->3414      75 
cry7Ab2      cryIIIC(c)        U04368      1->3414      75 
cry8Aa1      cryIIIE           U04364      1->3471      29 
cry8Ba1      cryIIIG           U04365      1->3507      66 
cry8Ca1      cryIIIF           U04366      1-3447       70 
cry9Aa1      cryIG             X58120      5807-9274    104 
cry9Aa2      cryIG             X58534      385->3837    32 
cry9Ba1      cryX              X75019      26-3488      97 
cry9Ca1      cryIH             Z37527      2096-5569    57 
cry9Da1      N141              D85560      47-3553      4 
cry9Da2                        AF042733    <1->1937     122 
cry10Aa1     cryIVC            M12662      941-2965     111 
cry11Aa1     cryIVD            M31737      41-1969      21 
cry11Aa2     cryIVD            M22860      <1-235       2 
cry11Ba1     Jeg80             X86902      64-2238      19 
cry11Bb1     94 kDa            AF017416                 72 
cry12Aa1     cryVB             L07027      1->3771      67 
cry13Aa1     cryVC             L07023      1-2409       90 
cry14Aa1     cryVD             U13955      1-3558       77 
cry15Aa1     34kDa             M76442      1036-2055    11 
cry16Aa1     cbm71             X94146      158-1996     5 
cry17Aa1     cbm72             X99478      12-1865      5 
cry18Aa1     cryBP1            X99049      743-2860     126 
cry19Aa1     Jeg65             Y07603      719-2662     86 
cry19Ba1                       D88381                   87 
cry20Aa1     86kDa             U82518      60-2318      61 
cry21Aa1                       I32932      1-3501       74 
cry22Aa1                       I34547      1-2169       76 
cyt1Aa1      cytA              X03182      140-886      118 
cyt1Aa2      cytA              X04338      509-1255     120 
cyt1Aa3      cytA              Y00135      36-782       26 
cyt1Aa4      cytA              M35968      67-813       30 
cyt1Ab1      cytM              X98793      28-777       109 
cyt1Ba1                        U37196      1-795        78 
cyt2Aa1      cytB              Z14147      270-1046     51 
cyt2Ba1      "cytB"            U52043      287-655      35 
cyt2Bb1                        U82519      416-1204     15 

^a The symbols < and > indicate that the coding region extends up- or downstream, respectively, from the known sequence data. 
^b Only the polypeptide sequence has been reported. 

[Figure 1: phylogram of the Cry and Cyt holotype proteins. Horizontal axis: Percent Amino Acid Sequence Identity. Bracket label: Main Cry Lineage.] 

FIG. 1. Phylogram demonstrating amino acid sequence identity among Cry and Cyt proteins. This phylogenetic tree is modified from a TREEVIEW visualization 
of NEIGHBOR treatment of a CLUSTAL W multiple alignment and distance matrix of the full-length toxin sequences, as described in the text. The gray vertical bars 
demarcate the four levels of nomenclature ranks. Based on the low percentage of identical residues and the absence of any conserved sequence blocks in 
multiple-sequence alignments, the lower four lineages are not treated as part of the main toxin family, and their nodes have been replaced with dashed horizontal lines 
in this figure. 



had significant branch points deeper in the tree than the cho- 
sen primary rank in the nomenclature. This sort of analysis was 
rejected as unsuitable for the purposes of Cry nomenclature 
due to the generally ragged branch lengths it produced and the 
requirement for the careful choice of an outgroup. 
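
The bootstrap procedure mentioned in this analysis resamples the columns of the alignment with replacement and rebuilds a tree from each replicate. The sketch below shows only the resampling step, as a minimal illustration; the function name `bootstrap_alignment`, the list-of-strings alignment format, and the toy sequences in the usage note are assumptions, and the sketch is not the CLUSTAL W routine used by the authors.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of `alignment`.

    `alignment` is a list of equal-length aligned sequences (strings).
    Columns are drawn with replacement, and the same drawn columns are
    applied to every sequence so that each column stays intact.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in columns) for seq in alignment]
```

A call such as `bootstrap_alignment(["MKTA", "MRTA"], random.Random(0))` yields two resampled sequences of length four. Repeating the call (1,000 times in the analysis described above) and building a tree from each replicate gives the support values for individual branch points.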
An alternative method of clustering protein sequences, capable 
of handling sequences that are quite diverse, is parsimony 
analysis. A consensus tree generated from 100 bootstraps 
of such an analysis displaces the two incomplete Cry1 
sequences (Cry1Bd and Cry1Af) and the two Cry1 sequences 
lacking the C-terminal protoxin segments (Cry1Ia and Cry1Ib) 
into a region of the tree populated with such shortened sequences 
(not shown). With the further exceptions of Cry12A 
being interjected into the Cry5 cluster and a number of sequences 
besides Cry6B clustering higher in the tree than 
Cry6A, the proposed nomenclature successfully reflects the 
grouping of sequences provided by this method of analysis as 
well. 

As noted above, the usual distance metrics for phylogenetic 
analysis account for multiple substitutions per site; most com- 
monly, the Dayhoff PAM metric is used. When this distance 
metric was applied to the alignment used to make Fig. 1, a 
large number of the sequence pairs were found to have infinite 
distance. Therefore, the main Cry lineage and the Cyt lineage 
were separately aligned, the distances were calculated, and the 
distance matrices were clustered by using the FITCH program 
(of the PHYLIP software package). This method of analysis 
revealed several strongly associated groups of sequences 
(>90% of trees) in the main Cry lineage that extend deeper 
into the tree than the primary rank assigned in the proposed 
nomenclature: Cry1; Cry3; Cry4; Cry7; the Cry5-Cry12-Cry13- 
Cry14-Cry21 group; the Cry8-Cry9 group; the Cry10-Cry19 
group; the Cry16-Cry17 group; and the Cry2-Cry11-Cry18 
group. Many of these groups, however, were separated by 
branch points that were either nonmajority or were found 
<60% of the time; thus, the arrangement of these groups 
would be likely to change with additional sequence additions. 
At the secondary rank, the only anomaly with respect to the 
proposed nomenclature was the interjection of the Cry1Ia and 
Cry1Ib sequences into the Cry1B group. This effect may be due 
to an artificially reduced distance between the Cry1I sequences 
and the incomplete Cry1Bd sequence caused by the particular 
distance metric used. The Cyt lineage sequences were separated 
into the expected two primary rank groups that separate 
into the expected secondary rank groupings. This more stan- 
dard phylogenetic approach also suffers from an accentuated 
visual disorientation of uneven branch lengths and shortening 
of the more closely related branches, especially at the tertiary 
rank (lowercase letter), where a great deal of comparative 
work has been done among the Cry1 toxins. 

In summary, the proposed nomenclature uses readily avail- 
able software that can be easily interpreted by investigators in 
the field and meets their needs as well as, or better than, 
alternative methods of analysis and presentation. When the 
holotype toxins were analyzed by alternative phylogenetic 
methods, the hierarchy implied by the nomenclature was es- 
sentially consistent with the resulting phylogenetic clustering, 
and the few exceptions were largely explainable by known 
properties of the sequences in question. 

The structure of the δ-endotoxin from Bacillus 
thuringiensis subsp. tenebrionis that is specifically 
toxic to Coleoptera insects (beetle toxin) has been 
determined at 2.5 Å resolution. It comprises three 
domains which are, from the N- to C-termini, a 
seven-helix bundle, a three-sheet domain, and a β 
sandwich. The core of the molecule encompassing 
all the domain interfaces is built from conserved 
sequence segments of the active δ-endotoxins. 
Therefore the structure represents the general fold 
of this family of insecticidal proteins. The bundle 
of long, hydrophobic and amphipathic helices is 
equipped for pore formation in the insect membrane, 
and regions of the three-sheet domain are 
probably responsible for receptor binding. 



The δ-endotoxins are a family of insecticidal proteins produced 
by Bacillus thuringiensis (B.t.) during sporulation, having relative 
molecular masses (Mr) 60,000-70,000 (60K-70K) in the active 
form and specific toxicities against insects in the orders of 
Lepidoptera, Diptera and Coleoptera (refs 1, 2). These toxins have been 
formulated into commercial insecticides for three decades (ref. 3), and 
now insect-resistant plants are engineered by transformation 
with Lepidoptera-specific toxin genes (refs 4-6). In the bacterium 
δ-endotoxins are synthesized as protoxins of Mr 70K-135K and 
crystallize as a parasporal inclusion ~1 μm in size, in which form 
they are ingested by the susceptible insect. The microcrystal 
dissolves in the alkaline pH of the midgut and the protoxin is 
cleaved by gut proteases to release the active toxin. δ-Endotoxins 
activated in vitro bind specifically and with high affinity (KD = 
0.1-20 nM) to protein receptors on brush-border membrane 
vesicles derived from the gut epithelium of target insects (refs 7-9) and 
create leakage channels of 10-20 Å diameter in the cell membrane (ref. 10). 
In vivo such membrane lesions lead to swelling and 
lysis of the gut epithelium (ref. 11) and death of the insect ensues 
through starvation and septicaemia. Active δ-endotoxins of 
different specificities show five strongly conserved regions in 
their amino-acid sequences (refs 1, 12). Exchanging sequence segments 
in the divergent regions between toxins of different specificities 
can produce active hybrids showing altered target 
specificity (refs 13-15). We have determined the atomic structure of a 
Coleoptera-specific δ-endotoxin (CryIIIA, beetle toxin) from 
B.t. subsp. tenebrionis (refs 16, 18) to elucidate the structural basis for 
target specificity and membrane perforation by this family of 
proteins. 

Structure determination 

Parasporal crystals of the beetle toxin contain the full-length 
644-residue protoxin (ref. 17) as the minor component, and a product 
of bacterial processing with 57 residues removed from the 
N-terminus as the major component (ref. 19). The latter (Mr 67K) is 
similar in sequence to the active form of other δ-endotoxins. 
After solubilization, papain cleavage converts the mixture to the 
67K toxin (see legend to Table 1). This was recrystallized in the 
original crystal form of the parasporal crystals, space group 
C222₁ and cell dimensions 117.1 by 134.2 by 104.5 Å, containing 
one molecule per asymmetric unit and 55% solvent by volume (ref. 18). 

Initial evaluation of derivatives was carried out at 4.5 Å resolution 
with data collected on the FAST TV diffractometer (ref. 20) using 
CuKα radiation. Complete datasets (Table 1) were then collected 
to 2.5 Å resolution from native crystals using the imaging 
plate systems at the EMBL outstation at DESY and from the 
mercury and platinum derivatives on film at SRS Daresbury. 
The electron density map (Fig. 1) at 2.5 Å resolution calculated 
with phases from multiple isomorphous replacement (mean 
figure of merit, 0.63) was easily interpretable and was improved 
by solvent flattening (refs 21, 22). A continuous polypeptide chain from 
residue 61 to residue 644 at the C terminus was traced unambiguously, 
and most side-chain atoms could be located in the 
map. The atomic model was built using the graphics program 
O (ref. 23) and had an initial R-factor of 37% for all data to 
2.5 Å. After preliminary refinement using the program X-PLOR 
(ref. 24), the current model, containing 584 amino acid residues 
and 40 bound water molecules, has an R-factor of 19.9% and 
r.m.s. bond length deviation of 0.017 Å. 

Description of the structure 

Overview. The beetle toxin is a wedge-shaped molecule with a 
radius of gyration of 58 Å. As shown in Fig. 2a, it comprises 
three domains. Domain I, from the N terminus of the 67K toxin 
to residue 290, is a seven-helix bundle in which a central helix 
is completely surrounded by six outer helices tilted at about 
+20° to it (Fig. 3b,c). Domain II, from residues 291 to 500, 
contains three antiparallel β sheets packed around a hydrophobic 
core with a triangular cross-section (Fig. 4). Domain III, 
from residues 501 to 644 at the C terminus is a sandwich of two 
antiparallel β sheets (Fig. 5). Domains I and III make up the

TABLE 1 Data collection and phasing statistics

Data collection

Data              Method of    Number of  Resolution  Number of     Unique reflections  R_merge*
                  collection   crystals   (Å)         measurements  (% completeness)
Native            image plate  8          2.5         121,767       27,727 (100)        0.108
CH3HgNO3          film         7          2.5         103,523       [illegible]         [illegible]
Hg(CH3COO)2       film         5          2.5         60,224        25,919 (94.5)       0.103
cis-Pt(NH3)2Cl2   film         7          2.5         86,629        25,924 (94.5)       0.107
K2OsO4            FAST         1          4.5         21,143        4,680 (100)         0.077
HoCl3             FAST         1          4.5         [missing]     4,701 (100)         0.069

Phasing statistics

Derivative        Anomalous  Number    R_deriv†  R_cullis‡  Phasing power§
                  data       of sites                       (resolution, Å)
CH3HgNO3          no         3         0.183     0.715      1.56 (2.5)
Hg(CH3COO)2       yes        6         0.247     0.609      2.28 (2.5)
cis-Pt(NH3)2Cl2   no         5         0.185     0.682      1.54 (2.5)
K2OsO4            no         4         0.149     0.757      1.26 (5.5)
HoCl3             no         3         0.095     0.741      1.35 (5.0)

Protein preparation: Solubilized parasporal crystals from B.t. subsp. tenebrionis were incubated at 0.5 mg ml⁻¹ protein with 0.125 units per ml of
Agarose-linked papain (Boehringer) in 3.3 M NaBr, 0.05 M sodium phosphate, pH 7.0, and 0.1 mg ml⁻¹ phenylmethylsulphonylfluoride (PMSF) for 30 min at
20 °C. Digestion was stopped by adding tosyl lysine chloromethylketone (TLCK) to 0.125 mg ml⁻¹ and Na2CO3 to one fifth volume and removing the
enzyme-beads. The 67K beetle toxin was then purified by gel filtration on Sephadex G75 equilibrated with 0.1 M NaHCO3, pH 10.5, 0.5 M NaBr. Crystallization:
Single crystals were obtained by microdialysis at a protein concentration of 2.5 mg ml⁻¹ against 0.1 M NaHCO3, pH 9.5, 1.2 M NaBr at 4 °C overnight, then
against 0.1 M NaHCO3, pH 9.2, 0.5 M NaBr at 16 °C; 3 mM NaN3, 0.1 mM PMSF and 0.1 mg ml⁻¹ TLCK were present in all buffers. Crystals were transferred
by stages to 0.05 M 2-(N-morpholino)ethanesulphonic acid (MES), pH 6.5, for derivative preparation and mounted in 0.03% low-melting agarose in this buffer
during data collection. Data collection: Image plate and film data were processed using MOSFLM (Imperial College, London) and CCP4 programs (Daresbury,
UK). FAST (ref. 20) data were collected and processed with MADNES (ref. 45), and scaled in 3° batches. Derivatives: Crystals were soaked respectively in 0.25 mM
CH3HgNO3 for 3.5 h, in 1 mM Hg(CH3COO)2 for 14 h, in freshly prepared 1 mM cis-Pt(NH3)2Cl2 for 21 h, in saturated K2OsO4 for 35 h, and in 2 mM HoCl3 for
3 days. Phase calculation: Two heavy-atom sites in each derivative were located from difference Patterson functions, except in the case of Hg(CH3COO)2
for which 3 sites were located, and the remaining sites were found by cross-phased difference Fouriers. Heavy-atom parameters were refined against
centric data and phases calculated for all data using the program PHARE (G. Bricogne). The two low-resolution derivatives were refined against phases
calculated from the high-resolution derivatives. Phasing with the three high-resolution derivatives gave an overall figure of merit of 0.61 (25-2.5 Å) and a
clearly interpretable map. Including the remaining derivatives slightly improved the connectivity of the map (overall figure of merit 0.63), and four cycles of
solvent flattening using a 50% solvent content and a 9 Å radius in mask calculation (refs 21, 22) improved the overall definition of densities. The starting model
was built using the program O (ref. 23) with the Bones option for main-chain tracing and the autobuild and manip options for side chains. Refinement by
simulated annealing using the program X-PLOR (ref. 24) reduced the R-factor from 0.37 to 0.25 without individual B-factors, and to 0.23 with restrained
individual B-factors. The model was adjusted in the loops 154-156, 429-436 and 483-488, and had 40 solvent molecules added, then refined by X-PLOR
again. The current model has an R-factor of 19.9%, with r.m.s. bond length deviation of 0.017 Å, r.m.s. bond angle deviation of 3.2°, and average atomic
B-factor of 18 Å².

* $R_{\mathrm{merge}} = \sum |I_i - \langle I \rangle| / \sum I_i$, where $I_i$ are intensity measurements for a reflection, and $\langle I \rangle$ is the mean intensity for this reflection.

† $R_{\mathrm{deriv}} = \sum |F_{PH} - F_P| / \sum |F_P|$, where $F_{PH}$ is the structure factor amplitude of the derivative crystal and $F_P$ is that of the native.

‡ $R_{\mathrm{cullis}} = \sum \big| |F_{PH} \pm F_P| - F_H(\mathrm{calc}) \big| / \sum |F_{PH} - F_P|$, where $F_P$ and $F_{PH}$ are defined as for $R_{\mathrm{deriv}}$, and $F_H(\mathrm{calc})$ is the calculated heavy-atom structure factor amplitude
summed over centric data only.
§ Phasing power = $\langle F_H \rangle / E$, the r.m.s. heavy-atom structure factor amplitudes divided by the residual lack of closure error.
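
The footnote definitions of the merging and derivative residuals can be sketched in a few lines. This is a minimal illustration, assuming the usual forms in which the merging residual sums absolute deviations from the mean intensity over all repeated measurements and the derivative residual compares derivative and native amplitudes; the function names and all numbers are hypothetical and are not data from the table.

```python
def r_merge(observations):
    """Merging residual over repeated intensity measurements.

    observations maps a reflection index (h, k, l) to the list of
    intensities measured for that reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_deriv(f_derivative, f_native):
    """Fractional isomorphous difference between derivative and native amplitudes."""
    numerator = sum(abs(ph - p) for ph, p in zip(f_derivative, f_native))
    return numerator / sum(f_native)


# Hypothetical measurements, for illustration only.
example_obs = {(1, 0, 0): [100.0, 110.0, 90.0], (0, 2, 0): [50.0, 54.0]}
print(f"R_merge = {r_merge(example_obs):.4f}")
print(f"R_deriv = {r_deriv([110.0, 45.0, 30.0], [100.0, 50.0, 25.0]):.4f}")
```

A small merging residual indicates that repeated measurements agree, while a derivative residual well above the merging residual indicates that the heavy atom changed the intensities by more than the measurement noise.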



bulky end of the molecule. Through their contact one of the
two β sheets in domain III is almost entirely buried. To our
knowledge (see, for example, ref. 25), the packing of helices in
domain I and of sheets in domain II are both novel arrangements.

Domain I. The central helix in this seven-helix bundle is α5 (Fig.
3b,c), which is oriented with its C terminus towards the bulky
end of the molecule. Viewed from this end, the outer helices
are arranged anticlockwise in the order of α1, α2, α3, α4, α6
and α7, with helices α1 and α7 adjacent to the β-sheet domains;
α2 is interrupted by a non-helical section and only the leading
half, α2a, is packed against α5. Figure 3a shows the alignment
of amino-acid sequence on the surfaces of the helices. The
helices are long, especially α3 to α7, which contain respectively
8, 7, 6, 9 and 7 complete helical turns and hence would be long
enough to span the 30-Å thick hydrophobic region of a membrane
bilayer. Furthermore, the six outer helices bear a strip of
hydrophobic residues (defined by ΔG ≥ 0 for transfer from oil
to water) down their entire length on the side facing helix α5,
so they are amphipathic. In keeping with the general observation
that secondary structures are close-packed and bury hydrophobic
surfaces (ref. 26), the helix contact angles in this domain cluster
around +20° rather than -50°, giving the bundle a bouquet-like
appearance (Fig. 3b). Figure 3c shows the bundle in cross-section.
The interhelical space contains 27 aromatic residues
which are packed in the edge-to-face fashion (ref. 27); all polar groups
in this region are hydrogen-bonded or in salt bridges.

The concentric arrangement of the seven-helix bundle is distinct
from the two-layered type seen in bacteriorhodopsin. There
is some resemblance to the pore-forming domain of colicin A (ref. 28),
in which two hydrophobic helices are shielded from solvent by
eight amphiphilic helices, but the colicin helices are generally
shorter. Like the colicin helices, the bundle in the beetle toxin
may be a soluble form of packaging for the hydrophobic and
amphiphilic helices that will form pores in the membrane after
a large change in conformation.
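
The statement that the longer helices of domain I could span a 30-Å hydrophobic layer can be checked with simple arithmetic. The sketch below assumes ideal α-helix geometry (3.6 residues per turn and a 1.5 Å rise per residue, textbook values that are not stated in the text) and a helix axis perpendicular to the membrane; only the turn counts and the 30 Å thickness come from the article.

```python
# Ideal alpha-helix geometry (textbook values, assumed here).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # Å along the helix axis

# Complete helical turns quoted in the text for helices alpha3 to alpha7.
TURNS = {"alpha3": 8, "alpha4": 7, "alpha5": 6, "alpha6": 9, "alpha7": 7}

BILAYER_CORE = 30.0  # Å, hydrophobic thickness quoted in the text


def helix_length(turns):
    """Axial length in Å of an ideal alpha-helix with the given number of turns."""
    return turns * RESIDUES_PER_TURN * RISE_PER_RESIDUE


def spans_bilayer(turns, thickness=BILAYER_CORE):
    """True if a helix held perpendicular to the membrane would cross the core."""
    return helix_length(turns) >= thickness


for name, turns in TURNS.items():
    print(f"{name}: {helix_length(turns):.1f} Å, spans bilayer: {spans_bilayer(turns)}")
```

Under these assumptions each turn contributes 5.4 Å, so even the shortest of the five helices (six turns) is slightly longer than the 30 Å core, whereas a five-turn helix would fall short.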

Domain II. In Fig. 4a and 4b the three sheets of this domain are
laid side-by-side, as they would be seen from the solvent. There
is an apparent structural duplication between the four-stranded
antiparallel sheets, sheet 1 and sheet 2. The chain connections,
β4, β3, β2, β5 and β8, β7, β6, β9, respectively, follow the order
of +3, -1, -1, +3, which is typical of the 'Greek-key' topology (ref. 29).
From both sheets the inner strands, β3 and β2 as well as β7 and
β6, extend some 20 Å to the apex of the molecule as two-stranded
β ribbons; and at the point of departure from the
sheets there is a β-bulge in β3 and in β7 to twist the plane of
the ribbon by nearly 90° relative to the sheet. The connections
between the outer strands cross over the ribbons on the solvent
side.

The pseudo-symmetry between these sheets is very approximate.
Using the least squares option in O (ref. 23), the sheet
region of the strands β3 and β2 can be brought to superimpose
on that of β7 and β6, with an r.m.s. fit of 0.72 Å for 13 α carbons.
But the r.m.s. fit increased to 1.1 Å for 23 α carbons of the

NATURE • VOL 353 • 31 OCTOBER 1991 



© 1991 Nature Publishing Group



ARTICLES 



FIG. 1 Electron density map in the neighbourhood
of Cys 243, calculated a, using combined phases (ref. 46)
from multiple isomorphous replacement and solvent
flattening, and b, using combined experimental
and model phases (ref. 46) after refinement by
X-PLOR. The refined structure is shown superimposed
for reference. Although Cys 243 is a major
site of both the methyl mercury (MM) and mercuric
acetate (MA) derivatives, the methyl mercury site
is in a hydrophobic environment compared with
the mercuric acetate site.




whole inner strands including the ribbon region, and 1.7 Å for
36 α carbons on all four strands. Nonetheless, the sequence
alignment brought by this superposition of the two sheets
revealed a low level of internal homology, with seven pairs of
equivalent residues (shown in bold) out of 41 aligned α carbons:

338 HRIQPHTRPQP(6)SFNYWS(1)NYVSTRPSI(0)GSHDIITSPP(10)NLKFH 395
402 AVAHTNLAVWP(0)SAVYSG(1)TKVBPSQYN(3)DEASTQTYDS(7)SWTSI 453

The three-stranded sheet 3 is formed by two separate polypeptide
segments. The C-terminal segment of domain II contributes
the two-stranded ribbon of β10 and β11, whereas the N-terminal
segment of this domain contributes strand β1, which is
hydrogen-bonded to β11; β1 is followed by a two-turn helix α8 and
an extended chain.

Figure 4c and d show in side view and in cross-section that
the three antiparallel sheets are packed around a triangular
hydrophobic core. This brings the strand β10 on the edge of
sheet 3 into proximity with strand β4 on the edge of sheet 1, as
well as placing the loops at the end of the three β ribbons into
a region of about 12 Å radius at the molecular apex. This domain
is in contact with helix α7 of domain I on the face of sheet 3
(Fig. 4c).

Domain III. Figure 5 is a ribbon drawing of the strands forming
the two sheets of the β sandwich. The sheet containing the
C-terminal strand is in contact with domain I and will be called
the inner sheet. This domain has the 'jelly-roll' topology (ref. 29),
because it can be generated by folding an antiparallel β ribbon
which starts with β13 (N terminus) and β23 (C terminus) on the
inner sheet, and ends in the loop between β18 and β19 on the
outer sheet; β14 is a short excursion from this ribbon and forms
the fifth antiparallel strand of the outer sheet. In addition, small
parallel sheets are formed at the edge of the β sandwich through
hydrogen bonding of strand β12 to β16 at the edge of the outer
sheet, and β1 to β13 at the edge of the inner sheet.

Distribution of conserved sequences. The core of the beetle toxin
molecule encompassing the domain interfaces is built from the
five sequence blocks that are highly conserved throughout the
δ-endotoxin family (ref. 1) (Fig. 2b,c). Block 1, located in the beetle
toxin sequence at residues 189-218, corresponds to the central
helix (α5) of the bundle in domain I. Block 2, residues 239-305,
overlaps with the latter half of α6, and with α7 and β1; the
latter hydrogen-bonds to the edge of the inner sheet in domain
III before forming part of the three-stranded sheet 3 in domain
II. Block 3, residues 491-538, overlaps with the latter part of
β11, where it is hydrogen-bonded to β1, and with the loops
connecting domains II and III. The remainder of block 3
together with blocks 4 and 5, namely residues 560-569 and 633
to the C terminus, respectively, constitute the three buried
strands of the inner antiparallel sheet in domain III. The high
degree of conservation of internal residues implies that
homologous proteins would adopt a similar fold. Using the
beetle toxin structure as a model, we can therefore propose a
basis for the insecticidal activity of δ-endotoxins as a family.

Basis of insecticidal function 

Solubility. The beetle toxin crystals are isomorphous with the
parasporal crystals (refs 18, 19) and show the molecular contacts
responsible for solubility behaviour in vivo. Four intermolecular salt
bridges, Asp 142-Arg 165, Asp 224-Arg 562, Asp 590-Arg 178,
and Glu 223-Lys 293, are located at contacts to three different
neighbouring molecules. Such salt bridges keep the protoxin
crystals insoluble until exposed to the extreme pHs in the insect
midgut.

Proteolytic activation. Pro-δ-endotoxins have M_r values of either
~130K or ~70K. Activation by larval gut proteases removes
the C-terminal half of the larger protoxins (refs 30, 31) and cleaves them
at residue 28 or 29 from the N terminus. The smaller protoxins,
such as that of the beetle toxin, are processed only at the N

FIG. 2 Overview. a, Schematic ribbon
representation of the beetle toxin
showing the domain organization.
Secondary structure assignments are
given by Yasspa within program O (ref.
23). The polypeptide pathway is indicated
by colouring the chain in the rainbow
order, from red at the N terminus
to blue at the C terminus. The three
domains are: I, a seven-helix bundle
(upper left); II, a three-sheet assembly
(bottom); and III, a β sandwich (upper
right). This and all following illustrations
of the structure are made with the
program MOLSCRIPT (ref. 47). b and c, Cα
trace (stereoview) of the molecule with
the five conserved sequence blocks
indicated by small beads at their Cα
positions. In b the view is as in a, and
in c it is down the central helix of the
bundle from the bulky end of the
molecule; c shows that the central helix
of domain I and the inner sheet of
domain III are conserved; b shows that
the helices at the domain I-II interface
and the loops at the domain II-III interface
are also conserved. Note in c the
helix packing of six around one in
domain I. d, The solvent channel in the
C222₁ lattice viewed along the c axis.
One half of the unit cell thickness is
shown, containing four molecules. The
other half of the cell is related to this
by a two-fold rotation about horizontal 
axes (blue lines) at (§,y, ±|). The stack- 
ing of both layers leaves solvent chan- 
nels that traverse the cell along the c 
direction. The N terminus of the
molecule (arrow) is accessible from
these channels.

FIG. 3 The seven-helix bundle. a, Helical nets
showing the position of amino-acid residues along
the 7 helices: α1 (63-79), α2 (α2a, 85-98 and
α2b, 104-117), α3 (123-152), α4 (160-185), α5
(193-214), α6 (222-254) and α7 (259-285). The
cylindrical surfaces of the helices are cut longitudinally
on the side facing the solvent and flattened
to give a view from the interior of the bundle.
The top of the drawing corresponds to the bulky
end of the whole molecule. Owing to tilting of the
outer helices, different helices are in register vertically
only at a level indicated by two arrows pointed
at α1 and α7; α5 is the central helix. Dotted curves
outline the strip of hydrophobic residues down the
inward surface of the other six helices. b, Cα trace
(stereoview) for the bundle viewed perpendicular
to α5. The relative tilt of the outer helices to α5
and that between adjacent outer helices are both
about 20°. The Cα trace is shaded grey over
helices α1 to α3 in the back, striped over helix
α5 in the centre, and white over helices α4, α6
and α7 in the front. c, Cross-section of the bundle
at the level indicated by the arrows in a, viewed
from the bulky end of the molecule. The helical
backbone is represented by curly ribbons passing
through the Cα positions. The outer helices are
positioned roughly hexagonally around the central
one and tilted relative to it, so the bundle forms
a left-handed superhelix. The aromatic side chains
are packed in an edge-to-face fashion. Hydrogen
bonds are shown for side-chain atoms.





terminus (refs 19, 32) where about 50 residues are removed. The activated
δ-endotoxins show a conserved C terminus, so-called sequence
block 5 (ref. 1). Its position as the middle strand of the buried
β sheet in domain III precludes further processing from the C
terminus. In fact deletion from this site by 4 to 8 residues results
in inactive mutants with altered solubility and immunogenicity
(refs 30, 33-35). This is not surprising as the inner sheet can be
expected to play a critical part in the structural integrity and
stability of the toxins through interaction with the helical bundle.

At the N-terminal cleavage sites the different protoxin
sequences show locally similar hydropathy profiles (refs 36, 37), which
would be consistent with a common topology for the N-terminal
region of the activated toxins as seen in the helical bundle of
the beetle toxin. In crystals of the beetle toxin, the N terminus
at the start of helix α1 borders on a large solvent channel of
about 30 Å diameter that crosses the unit cell along the c direction
(Fig. 2d). This channel could allow access of sporulation-associated
proteases to the cleavage site in parasporal crystals (ref. 19).

Receptor binding. The insecticidal selectivity of δ-endotoxins is
due to high-affinity binding to specific membrane receptors (refs 7-9, 38),
which in three cases seem to be glycoproteins (refs 38-40). For several
δ-endotoxins the specificity-determining regions have been
delimited by exchanging sequence segments between closely
related toxins of differing specificities (refs 13-15). Guided by the location
of secondary structures in the beetle toxin, a plausible
alignment of δ-endotoxin sequences was made for the non-

FIG. 4 The three sheets of domain II. a, Schematic
ribbon drawing of sheets 1, 2 and 3 laid side-by-side.
Each is viewed from the exterior of the
domain. Note the Greek-key topology of sheets 1
and 2 and the similarity between their folds. b,
Hydrogen bonding of the polypeptide backbone for
the three sheets. The β strands are shown by the
main-chain atoms and by the residue numbers at
their ends; connecting strands are shown as coils.
c, Cα trace of the three assembled sheets in
domain II viewed towards domain I (stereoview).
The Cα trace is shaded grey over sheet 1, striped
over sheet 2, and white over sheet 3. d, Cross-section
of domain II (stereoview) showing the
packing of three sheets in a triangle around the
hydrophobic core. The view is towards domain III.



conserved regions (ref. 12, and T. C. Hodgman, unpublished
results). Hence the genetically identified specificity-determining
regions can be mapped to equivalent positions in the beetle
toxin structure, and these fall mainly in domain II. For instance,
the dual specificity of CryIIA for Lepidoptera and Diptera, as
distinct from the Lepidoptera specificity in the closely related
CryIIB, is determined by residues 307-382 of their sequences (ref. 14),
which corresponds roughly to sheet 1 (Fig. 4a) plus strand β6
in sheet 2 and the loop leading up to β7, whereas the Lepidoptera
specificity of CryIIB is dependent on a longer segment (ref. 14) that
would include both inner strands of sheet 2. Similarly, the
toxicities of CryIA(a) and CryIA(c) to two lepidopteran insects
depend on three segments termed x, y and z (ref. 15): amino-acid
substitutions in y can reduce toxicity by up to 2,000-fold, and
segments x and y interact in determining specificity. Aligned
with the beetle toxin structure, segment x corresponds roughly
to the outer strands β4 and β5 of sheet 1 and the whole of sheet
2, including the loop entering β10 in sheet 3; y corresponds to

FIG. 5 Domain III, schematic ribbon representation of the β sandwich. β
strands forming the inner sheet are shaded grey. The topology of an
eight-stranded 'jelly-roll' can be seen by following the β hairpin starting
with β13, β15 and β23 in the inner sheet, continuing to β16 and β22 in the
outer sheet, then β17 and β21, β20 in the inner sheet, and ending with β18
and β19 in the outer sheet. β14 is an excursion from the hairpin and forms
a fifth antiparallel strand of the outer sheet. Small parallel β sheets are
added to one edge of the β sandwich, by hydrogen bonding of β1 to β13
in the inner sheet and β12 to β16 in the outer sheet. Residue numbers in
the β strands are: β12, 502-506; β13, 509-513; β14, 519-525; β15,
536-541; β16, 547-554; β17, 558-569; β18, 573-579; β19, 585-591;
β20, 604-609; β21, 611-614; β22, 619-625; and β23, 631-643.

strand β10 of sheet 3 and the loop connecting β10 and β11; and
z extends from β11 to the C-terminal activation site. Furthermore,
the interaction between x and y can be understood in terms of
the proximity between β4 on the edge of sheet 1 and β10 on the
edge of sheet 3. Although z was inferred (ref. 15) to extend into
domain III, the combined evidence from genetics and receptor-binding
assays in vitro for Lepidoptera toxins (refs 9, 41) correlates
receptor recognition with sequence variations within domain
II. We note that the β ribbons from all three sheets terminate
in loops in a small region on the molecular apex, in a manner
reminiscent of the complementarity-determining region of
immunoglobulins.

Pore formation. The common mechanism of epithelial cell disruption
by δ-endotoxins of widely different specificities is believed
to be the formation of lytic pores of 10 to 20 Å diameter in the
insect membrane (ref. 10). The structure of the beetle toxin displays
an apparatus for pore formation in the long, hydrophobic and
amphipathic helices of domain I which could penetrate the
membrane. Between the crystal structure in which the bouquet-like
helical bundle internalizes all the hydrophobic surfaces,
and the unknown pore structure where hydrophobic surfaces
would be in intimate contact with the membrane lipids, large
conformational changes must occur. In the absence of a full
characterization of the pore-forming process, we propose the
following by extrapolation from the crystal structure.

The trigger for the conformational changes may be provided
by receptor binding and the consequent interaction of toxin
with the membrane bilayer. Membrane insertion follows rapidly,
so that a major part of the bound δ-endotoxin cannot be displaced
from the brush-border vesicles by other toxins recognizing
the same receptor sites (refs 7, 9). As domain II and probably its
apical region are most likely to bind the membrane receptors,
the helices are expected to insert with the 'domain II end' (see
Fig. 2a) oriented towards the cytoplasm. If helical hairpins are
to initiate the membrane penetration, as probably happens for
colicin (refs 28, 42, 43), they will probably be linked at the domain II end.
So either of the helix pairs α6-α7 or α4-α5 could be the likely
initiator. The α6-α7 pair is favoured because it forms part of
the conserved interface with domain II and is well positioned
to sense the receptor binding. On the other hand, helix α5 is
the most conserved throughout the family of δ-endotoxins. Point
mutations in α5 reduce toxicity of a Lepidoptera toxin without
reducing binding to membranes (ref. 44). Proteolysis in the interhelical
loops at the domain III end, as in the α3-α4 loop (refs 19, 32), may
facilitate release of the helix pairs from the tertiary structure of
the bundle. The insertion of a hairpin can create a defect in the
membrane, allowing the rest of domain I to participate in pore
formation in a cooperative manner. □

Summary

Background: Genetically modified (GM) crops that express
insecticidal protein toxins are an integral part of
modern agriculture. Proteins produced by Bacillus thuringiensis
(Bt) during sporulation mediate the pathogenicity
of Bt toward a spectrum of insect larvae whose
breadth depends upon the Bt strain. These transmembrane
channel-forming toxins are stored in Bt as crystalline
inclusions called Cry proteins. These proteins are
the active agents used in the majority of biorational
pesticides and insect-resistant transgenic crops. Though
Bt toxins are promising as a crop protection alternative
and are ecologically friendlier than synthetic organic
pesticides, resistance to Bt toxins by insects is recognized
as a potential limitation to their application.

Results: We have determined the 2.2 Å crystal structure
of the Cry2Aa protoxin by multiple isomorphous replacement.
This is the first crystal structure of a Cry toxin
specific to Diptera (mosquitoes and flies) and the first
structure of a Cry toxin with high activity against larvae
from two insect orders, Lepidoptera (moths and butterflies)
and Diptera. Cry2Aa also provides the first structure
of the proregion of a Cry toxin that is cleaved to
generate the membrane-active toxin in the larval gut.

Conclusions: The crystal structure of Cry2Aa reported 
here, together with chimeric-scanning and domain- 
swapping mutagenesis, defines the putative receptor 
binding epitope on the toxin and so may allow for alter- 
ation of specificity to combat resistance or to minimize 
collateral effects on nontarget species. The putative re- 
ceptor binding epitope of Cry2Aa identified in this study 
differs from that inferred from previous structural studies 
of other Cry toxins. 

Introduction 

The almost 20 million hectares of GM crop fields in North 
America consist of crops engineered for herbicide or 
insect resistance. The genes that confer the latter trait 
come from Bacillus thuringiensis (Bt), a family of Gram- 
positive sporulating soil bacteria that produce para- 
sporal crystals with insecticidal activity. The insecticidal 
activity of particular Bt isolates is directed against narrow
spectra of insect larval species, usually within a

4 Present address: Maxygen, Redwood City, California, 94063.



single order. Bacterial toxins known as insecticidal crystal
proteins (ICPs) or crystalline (Cry) proteins that are
sequestered as protoxins in crystalline inclusions after
sporulation mediate this species-specific pathogenicity
[1]. The Cry protoxins are ingested, solubilized in the
larval gut [2, 3], and activated by the removal of an
amino-terminal segment and a C-terminal segment, the
size of which depends on the gene or its protoxin [2, 4].
The active toxins associate with insect-specific receptors
of gut epithelial cells of the target insect [5] and
subsequently insert into the cell membrane [6, 7], leading
to the formation of ion channels [8, 9, 10]. This results
in disruption of the electrochemical balance across the
basal membrane, gut paralysis, and larval death [11, 12,
13, 14]. The host cadaver serves as growth medium
for vegetative cells arising from germination of the Bt
endospores.

Species selectivity of Cry proteins is encoded in the
binding site for the target receptor [5]. Classification of
the Cry proteins is based on amino acid sequence identity
[15] and is roughly correlated with the taxonomic
order of susceptible insect species, spanning species
of agricultural (Cry1 Lepidoptera, Cry2 Lepidoptera, and
Cry3 Coleoptera) and public health (Cry2 and Cry4 Diptera)
significance. The structure may help guide mutagenesis
followed by screening that is directed toward
the fine tuning of species selectivity in order to design
insecticides that do not kill nontarget organisms such
as monarch larvae [16]. It also may assist in the elucidation
of the structural basis of resistance to Bt toxins and
the subsequent generation of novel insecticidal toxins
for use on Bt-resistant insects [17, 18].

Structure-based protein engineering of Cry toxins
may direct the search for variants with broader susceptible
species spectra, optimal potency, and stability properties.
Cry2Aa is among an unusual subset of Cry proteins
possessing broad insect species specificity by
exhibiting high specific activity against two insect orders,
Lepidoptera and Diptera [19, 20]. It is lethal to
more lepidopteran species than the Cry1 toxins deployed
against agriculturally important Lepidoptera [21]
and exhibits a low level of crossresistance in Cry1A-resistant
insects [22]. Also, the mode of action of Cry2Aa
may be distinct from that of other Cry toxins [23]. Thus,
it could serve as a platform for the design of Cry toxins
with broader susceptible species spectra and minimal
Cry1A-derived crossresistance in the field.

Chimeric-scanning mutagenesis experiments have 
identified disjoint blocks (D and L, see Results and Dis- 
cussion) of the Cry2Aa sequence that separately confer 
specificity against dipteran (D) and lepidopteran (L) spe- 
cies [24, 25]. These experiments also demonstrate that 
maximal activity against lepidopteran species requires 
not only L block residues but also some of the specificity 
determinants of the D residue block. Further, Cry2Ab, 
an 87% sequence identical homolog of Cry2Aa, has 

Key words: Bacillus thuringiensis; delta-endotoxin; Cry2Aa; binding
epitope; crystal structure; X-ray

Table 1. Data Collection, MIR Phasing, and Structure Refinement Statistics for Cry2Aa

                                         Native    U(NO3)   PtCNS     PtU      NbCl2    Ru¹       Hg²

Data Collection (1.08 Å)
Resolution (Å)                           2.2       2.6      2.5       2.6      2.6      2.5       2.5
Unit cell dimensions (Å)                 a = b = 85.6, c = 163.9
Space group                              P4₃2₁2
Number of observed reflections
  (σF cutoff 2.5)                        245,580   69,703   139,057   70,618   73,949   113,242   126,930
Number of unique reflections             31,591    17,370   20,476    17,999   17,455   19,475    20,198
Completeness (%)                         99.3      89.1     99.0      92.      89.3     94.7      97.0
R_merge (%)                              5.7       5.1      5.7       4.7      4.3      5.4       6.3

Phasing/MIR
Resolution                                         2.6      2.6       2.6      4.5      2.6       5.25
Number of sites refined                            5        6         7        6        5         3
Number of reflections (σ > 3)                      16,177   17,808    16,516   3,136    17,074    2,419
R [subscript illegible] (%)                        16       24        32       10       10        16
R_Cullis                                           .62      .62       .59      .60      .67       .62
R [subscript illegible]                            .13      .15       .20      .06      .08       .09
Phasing power                                      1.1      1.9       1.8      1.4      0.8       1.2
<FOM> centric³                                     .36      .39       .41      .41      .30       .42
<FOM> overall (n phased)⁴                .65 (18,677)

Refinement
Resolution (Å)                           28.0-2.2
Number of reflections (completeness %)   31,509 (93)
R_cryst [σ = 0] (2.3-2.2 Å)              .18 (.21)
R_free [5% test] (2.3-2.2 Å)             .24 (.23)
Number of non hydrogen atoms             5,001
Number of water molecules                514
Rms bond distances (Å)                   .005
Rms bond angles (°)                      1.2

1 [Ru(NH3)...]Cl... (formula garbled in source).
2 Para chloromercuri phenol (PCMP).
3 Individual data set results.
4 Final number of phased reflections.



negligible activity against dipteran species and 3- to 
8-fold less activity against certain lepidopteran species 
[25, 26]. Hence, Cry2Aa structure and mutagenesis data 
provide the basis for future protein engineering of Cry 
toxins with modified specificity and selectivity profiles. 
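
As a side note on the data-collection rows of Table 1, the ratio of observed to unique reflections gives the average multiplicity (redundancy) of each dataset. The sketch below uses the reflection counts as printed in the table for three of the datasets; the dictionary labels and the function name are illustrative choices, and multiplicity itself is not a quantity reported by the authors.

```python
def multiplicity(n_observed, n_unique):
    """Average number of measurements per unique reflection."""
    if n_unique <= 0:
        raise ValueError("n_unique must be positive")
    return n_observed / n_unique


# (observed, unique) reflection counts as printed in Table 1.
TABLE1_COUNTS = {
    "Native": (245_580, 31_591),
    "PtCNS": (139_057, 20_476),
    "Hg (PCMP)": (126_930, 20_198),
}

for label, (observed, unique) in TABLE1_COUNTS.items():
    print(f"{label}: multiplicity {multiplicity(observed, unique):.2f}")
```

On these figures the native dataset was measured with somewhat higher redundancy than the derivatives, which is one plausible reason it could be refined to the highest resolution.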

To understand the structural determinants of Cry toxin
specificity, we determined the crystal structure of the
protoxin of Cry2Aa from Bacillus thuringiensis subsp.
kurstaki. The complete structure was determined by
multiple isomorphous replacement and refined to 2.2 Å
resolution. We have identified a candidate toxin receptor
binding surface that is consistent with available
chimeric-scanning mutagenesis data.

Results and Discussion 

The structure of Cry2Aa from Bacillus thuringiensis subsp.
kurstaki was determined by multiple isomorphous replacement
using six heavy atom derivatives and was refined to 2.2 Å
resolution with Rcryst = 18% (Table 1). The structure of the
633-amino acid protoxin contains the N-terminal 49-amino acid
peptide that is cleaved upon activation and the three domains
of what will become the mature toxin [27]. The structures of
the three domains are surprisingly similar in overall topology
(Figure 1a) to those of the activated toxins Cry3Aa [28] and
Cry1Aa [29], suggesting that removal of the activation peptide
serves to expose regions of the toxin rather than alter its
conformation. This structural homology is also surprising
since these toxins have little sequence identity to Cry2Aa
(20% and 17%, respectively). In the mature toxin, the
N-terminal domain (residues 1-272) is a pore-forming
seven-helical bundle (Figure 1d) [1]. The second domain
(residues 273-473) is a receptor binding β prism, a three-fold
symmetric arrangement of β sheets, each with a Greek key fold
(Figure 1e). The third domain (residues 474-633) is implicated
in determining both larval receptor binding [30, 31] and pore
function [32] and is a lectin-like C-terminal β sandwich
(Figure 1f).
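The domain boundaries quoted above lend themselves to a simple lookup, which can be handy when mapping mutagenesis positions onto the structure. The following is a minimal sketch, assuming inclusive residue ranges exactly as stated in the text; the `DOMAINS` list and the `domain_of` function are hypothetical names introduced for illustration.

```python
# Domain boundaries as quoted in the text (inclusive residue numbers).
# DOMAINS and domain_of are illustrative names, not from the original work.
DOMAINS = [
    ("I", 1, 272),      # pore-forming seven-helical bundle
    ("II", 273, 473),   # receptor binding beta prism
    ("III", 474, 633),  # lectin-like beta sandwich
]


def domain_of(residue: int) -> str:
    """Return the domain label ("I", "II", or "III") for a residue number."""
    for label, first, last in DOMAINS:
        if first <= residue <= last:
            return label
    raise ValueError(f"residue {residue} is outside the range 1-633")


if __name__ == "__main__":
    # A few example residue numbers spanning the three domains.
    for residue in (49, 272, 273, 365, 404, 474, 633):
        print(residue, domain_of(residue))
```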

Available chimeric-scanning mutagenesis data [24, 25] define
a candidate toxin-receptor binding surface on Cry2Aa that is
comprised of a distribution of hydrophobic residues
(Ile474-Ala477 from β12a, Val365-Leu369 from the β5-β6 loop,
and Leu402-Leu404 from the β7-β8 loop) across the
solvent-exposed surface of the β prism and β sandwich domains
(Figure 1b). Proteolytic activation of the toxin involves the
removal of the 49 N-terminal amino acids and exposes residues
comprising this putative toxin-receptor binding surface.
Removal of the 49 amino terminal residues, comprised of α0,
α0a, and an N-terminal coil, would not affect the structure of
the seven-helical membrane insertion domain, as seen by
comparing the structures of the activated toxin Cry1Aa and
that of the protoxin Cry2Aa.
Figure 1. Topology and Solvent Accessible Surface of Cry2Aa

(a) Ribbon diagram, rendered by Midas Plus [48], of Cry2Aa. Domain I is shown in magenta, domain II is shown in blue, and domain
III is shown in cyan. The N terminus is shown in red, while functionally important loops delimiting the putative toxin-receptor
binding epitope are shown in green. A Cry2Aa insertion, relative to Cry3Aa and Cry1Aa, before β12 at the N terminus of domain III
is shown in magenta. Numbered β strands referred to in the text are labeled.

(b) The solvent accessible surface, as calculated by GRASP [49], of domains II and III of Cry2Aa. The orientation is identical to
that shown in Figure 1a. The projection of residue hydrophobicity onto this surface is shown in color. Portions of the hydrophobic
surface contributed by residues 474, 476, and 477 are shown in cyan, those contributed by residues 365-369 are shown in blue,
those contributed by residues 402 and 404 are shown in magenta, and the remainder of the surface contributed by hydrophobic
residues is shown in yellow. The remaining surface that is identified as nonhydrophobic is colored white. Residue hydrophobicity
is as defined by GRASP [49]. The prominent hydrophobic patch is the center of the putative toxin-receptor binding epitope. For
orientation, the portion of the surface contributed by residue 357 of the β4-β5 loop is shown in red.

(c) The solvent accessible surface (as calculated by GRASP) of domains II and III of Cry2Aa. The orientation is identical to that
shown in Figure 1a. The projection of residue hydrophobicity onto this surface is shown in yellow, while the N terminus is shown
in red; the N terminus sterically hinders access to the putative toxin-receptor binding epitope. Portions of the surface that are
identified as nonhydrophobic are colored white.

(d-f) The three domains of Cry2Aa shown in the same orientation as in Figure 1a. Labels with amino acid numbers identify the
visible N and C termini of each domain in the figures.



This is also expected since constructs consisting of the
N-terminal helical domain of the complete Cry3Ba1 protoxin
(prior to cleavage) are capable of nonreceptor-mediated
partitioning into lipid bilayers [33], as is the activated
toxin.

The structure of Cry2Aa suggests that the N-terminal residues
should sterically hinder access to the putative binding
epitope β5-β6 and β7-β8 loops (Figure 1a, shown in green) and
the exposed parts of domain III closest to domain II.
Projection of hydrophobicity onto the solvent accessible
surface of domains II and III reveals an 800 Å² hydrophobic
patch (Figure 1b) proximal to these loops. However, while the
structure suggests that the 49 N-terminal residues (α0, α0a,
and the N-terminal coil) should sterically hinder access to
the putative binding epitope, the biological rationale for
this function is unclear. It is unlikely that Bt possesses a
receptor with affinity for the activated toxin. Hence, it does
not seem likely that the N terminus serves to prevent
premature activation of the toxin within Bt. One simple
explanation is that occlusion of the hydrophobic patch of the
putative binding epitope prevents nonspecific aggregation of
the toxin with itself or other host proteins. Another
explanation is that the N-terminal amino acids play a role in
the formation of the environmentally stable crystalline
inclusions.

The specificity-distinguishing residues are also indicated by
comparison of the Cry2Aa structure with the structure of the
highly homologous (87% sequence identity) Cry2Ab that is
inactive against some Cry2Aa target species (Figure 2a).
Chimeric-scanning mutagenesis [24, 25] defines a continuous
106 amino acid block, 307-412, of specificity-distinguishing
residues. (Specifically, [25] demonstrated that substitution
of residues 278-340 resulted in loss of dipteran-specific
activity in Cry2Aa, while [24] demonstrated that substitution
of residues 307-382 conferred dipteran-specific activity to
Cry2Ab. Thus, in our discussion, we adopt residue 307 as the
N-terminal boundary of the specificity-conferring sequence in
Cry2Aa.) Within these 106 amino acids, there are 23 residues
that differ between Cry2Aa and Cry2Ab (sequence alignment
presented in Figure 5). Most of the Cry2Aa-Cry2Ab amino acid
differences lie within or about the domain II/III 800 Å²
hydrophobic patch (Figure 1b) and surrounding residues from
the β5-β6, β7-β8, and β4-β5 loops (Figure 1a). The picture of
the putative toxin-receptor binding surface that emerges is
that of an 800 Å² hydrophobic region surrounded by three
loops, those joining β4-β5, β5-β6, and β7-β8, which are also a
part of the putative binding site. The three loops contain
hydrophilic side chains that may be involved in specific
hydrogen bonding with the receptor and so signal a portion of
the site that could be mutated both to probe these
interactions and to alter specificity.
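The residue counts in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes inclusive residue ranges and uses the D block (307-340) and L block (341-412) boundaries given elsewhere in the text; the names `BLOCKS` and `block_length` are hypothetical and serve only this illustration.

```python
# Inclusive residue ranges for the two specificity blocks, as given in the text.
# BLOCKS and block_length are illustrative names introduced here.
BLOCKS = {"D": (307, 340), "L": (341, 412)}

# Number of residues that differ between Cry2Aa and Cry2Ab within 307-412.
DIFFERING_RESIDUES = 23


def block_length(first: int, last: int) -> int:
    """Number of residues in an inclusive range."""
    return last - first + 1


if __name__ == "__main__":
    lengths = {name: block_length(*span) for name, span in BLOCKS.items()}
    total = sum(lengths.values())
    print(lengths)                      # per-block residue counts
    print(total)                        # size of the combined 307-412 block
    print(DIFFERING_RESIDUES / total)   # fraction of positions that differ
```

Under these assumptions the D block spans 34 residues and the L block 72, which together give the 106-residue block; the 23 differing positions correspond to roughly 22% of it.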

The proximity of this surface to solvent-exposed loops of the
lectin-like domain III is consistent with the finding that
domain III plays a role in fine-tuning the susceptibility of
different species. This has been demonstrated by replacement
of domain III [30, 31] to make chimeric toxins with altered
specificity characteristics.

Figure 2. Space-Filling Representation of Cry2Aa Specificity-Conferring Residues, Detail of Buried D Block Residues, and Electron
Density Covering Buried D Block Residues

(a) Space-filling model of Cry2Aa domains II and III with the N terminus and membrane-inserting domain I removed. The orientation
reflects a -20° rotation relative to that shown in Figure 1a. The results of chimeric-scanning [24, 25] mutagenesis experiments
are projected onto the van der Waals surface of Cry2Aa. The residues colored green and cyan are single amino acid differences
between Cry2Aa and Cry2Ab in block L (residues 341-412). The residues colored yellow and orange are single amino acid differences
between Cry2Aa and Cry2Ab in block D (residues 307-340). The bar represents an approximate 10 Å scale.

(b) Packing of D block residues behind the β4-β5 loop. The β4-β5 loop contains L block specificity determinants with which the
buried D block residues interact.

(c) Electron density for the putative receptor binding site covering residues of the β sheet behind the β4-β5 loop.

The N-terminal strand β12a of domain III is not present in the
three-dimensional structures of Cry1Aa or Cry3Aa. The turn
between this strand and the rest of domain III is functionally
replaced almost exactly by a loop that connects β3 and β4 of
domain II in the homologous Cry1Aa and Cry3Aa structures
(Figures 1a and 3, shown in magenta).

Figure 3. Detail of Ribbon Diagram Overlap of Cry2Aa and Cry1Aa
The Cry1Aa domains have been independently fit to those of Cry2Aa. The functionally important loops delimiting the putative
toxin-receptor binding epitope are shown in green (Cry2Aa) and blue (Cry1Aa). The Cry2Aa insertion, relative to Cry3Aa and Cry1Aa,
before β12 at the N terminus of domain III is shown in magenta, while the corresponding loop from Cry1Aa is shown in cyan (see
arrow).

This functionally conserved β12a motif occupies the same
region of the structure as the β3-β4 turn in Cry1Aa and
Cry3Aa, so it implies conservation of a functional role in
protecting the hydrophobic portion of the putative receptor
binding surface implicated by the homolog substitutions.
Chimeric-scanning mutagenesis identifies fairly large 
Figure 4. Schematic Representation of Chimeric-Scanning Mutagenesis Data

The first and last bands represent the Cry2Aa and Cry2Ab sequences, respectively. The middle bands represent chimeric
combinations in which gray regions correspond to Cry2Ab sequence and white regions correspond to Cry2Aa sequence. For all bands,
except that corresponding to Hyb513, the three central vertical bars represent amino acids 278, 340, and 412. For Hyb513, the two
central vertical bars represent amino acids 307 and 382. The activity designations represent an approximate log scale. For
example, the (+ + +) activity designation for chimera DL112 corresponds to an ID50 of 126 (85.7-187) ng, while the (+) designation
for chimera DL115 corresponds to an ID50 of 3,200 (1,340-51,900) ng; the confidence intervals correspond to 2σ.
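Because the activity designations are described as an approximate log scale, it can help to express the two ID50 values quoted in the Figure 4 caption as a fold difference and as a gap in log10 units. This is a minimal sketch using only those two quoted values; the dictionary and function names are assumptions made for illustration, and no mapping from ID50 to the number of plus signs is implied.

```python
import math

# ID50 values (ng) quoted in the Figure 4 caption for two chimeras.
# ID50_NG, fold_difference and log10_gap are illustrative names.
ID50_NG = {"DL112": 126.0, "DL115": 3200.0}


def fold_difference(less_active: float, more_active: float) -> float:
    """Ratio of two ID50 values (higher ID50 means lower activity)."""
    return less_active / more_active


def log10_gap(less_active: float, more_active: float) -> float:
    """Difference between two ID50 values in log10 units."""
    return math.log10(less_active) - math.log10(more_active)


if __name__ == "__main__":
    fold = fold_difference(ID50_NG["DL115"], ID50_NG["DL112"])
    gap = log10_gap(ID50_NG["DL115"], ID50_NG["DL112"])
    print(f"fold difference: {fold:.1f}")
    print(f"log10 gap: {gap:.2f}")
```

For these two values the ratio is about 25-fold, or roughly 1.4 log10 units, which is in line with the two-step drop from (+ + +) to (+) on an approximately logarithmic scale.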
Figure 5. Detail Sequence Alignment of
Cry2Aa and Cry2Ab

Sequence alignment of the D and L block 
regions of Cry2Aa and Cry2Ab generated us- 
ing ALSCRIPT [51]. In the alignment, identical 
amino acids are unmarked; similar residues 
(as defined by ALSCRIPT) are colored yellow, 
while dissimilar residues are marked green. 
The secondary structure associated with se- 
quence is presented in the lowermost row. 
The block of secondary structure associated 
with D block residues is colored magenta, 
while that associated with L block residues 
is colored cyan. 



regions of the protein sequence that confer differential
specificity to Diptera and Lepidoptera [25] (Figure 4). In
Figure 4, the first band represents the sequence of Cry2Aa
with its high level of activity (+ + +) against both
Lepidoptera and Diptera. The last band represents the Cry2Ab
sequence that exhibits negligible activity ( ) against Diptera
and up to one order of magnitude lower activity against
Lepidoptera when compared with Cry2Aa. The second band (DL112)
represents a Cry2Aa chimera that contains the Cry2Ab sequence
for the block D residues 307-340 (dipteran-specific). This
chimera has negligible activity against Diptera and is
suggested to have reduced activity (at the 1σ level) against
Lepidoptera, indicating that block D correlates with dipteran
specificity. The activity profile of a reverse chimera (the
third band) [24], in which Cry2Ab contains the block D
sequence from Cry2Aa, shows a more significant reduction than
DL112 against Lepidoptera (of a different species) but is only
reduced 20-fold toward Diptera versus Cry2Aa. Thus
antidipteran activity tracks with the D block of Cry2Aa.

The fourth band (DL115) represents a Cry2Aa chimera that
contains the Cry2Ab sequence for the dipteran-disfavoring D
block and for a neighboring region of sequence, the
lepidopteran-disfavoring L block (residues 341-412). The
activity profile of this construct against both Diptera and
Lepidoptera most closely parallels that of Cry2Ab, which is
consistent with blocks D and L encoding essentially all of the
differential specificity determinants. In summary, the
differential specificity for Diptera in Cry2Aa depends on
block D, while that for Lepidoptera depends on block L.
Maximal activity


Table 2. Solvent Accessible Surface Areas, Contacts within 3.4 Å, and Hydrogen Bonds for the Specificity-Conferring
Residues in Cry2Aa

Residue   Exposed Surface (Å²)   Exposed Surface Beyond Cβ (Å²)   Contacts

Dipteran Specificity-Conferring
Ile307    6                      4                                Ser309, Ser343, Gly481, (Met483), (Tyr342)
Ser309    26                     26                               Asn341, Ile307, Thr364, (Ser363)
Ile311    1                      0                                Cys362, (Arg339), (Asn361)
Thr314    7                      7                                Ser337, Asn357, Asn336, His358, (Asn359)
Ile318    91                     89                               Thr332, (Thr331)
Gly324    78                     0
Ser334    5                      5                                Leu316, Asn336, (Phe409), (Gln399), (Arg315)
Asn336    6                      6                                Thr314, Ser334, Ala460, Ala353, (Gly313), (Ile351)
Ser337    0                      0                                Thr314, Ala353, His358

Lepidopteran Specificity-Conferring
Val346    39                     34                               Tyr342, (Asn303), (Gly344)
Leu350    27                     26                               Asn449, Ile450
Thr354    50                     26                               Glu451
Asn355    109                    76                               (Pro457)
Leu356    60                     43                               (Ala353)
His358    43                     14                               Ser312, Thr314, Ser337, (Gly313)
Val365    107                    75                               (Asn336)
Ser370    68                     39                               Pro367, (His21)
Thr382    54                     24                               (Asn392), (Thr391)
Ser390    9                      9                                Ser329, Thr331, Asp383
Gln399    33                     33                               Val374, Arg375, Arg405, (Leu404)
Ser403    93                     73
Cys406    37                     27                               Ser397, Phe398, Cys362
Ser410    89                     72

All data were calculated for the activated toxin using HBPLUS [50]. Entries in the left-most column are the 23 specificity-conferring residues.
Entries in the right-most column conform to hydrogen bonding geometry, except for those enclosed in parentheses that are van der Waals
contacts. Bold entries in the right-most column identify specificity-conferring residues also found in the left-most column.
Figure 6. Restriction Maps Detailing the Construction of Plasmid pSB307

(a) Nucleotide sequences of the oligonucleotides GKP-6 and GKP-7.

(b) Restriction maps of pSB302, pSB304, and pSB307.



against Lepidoptera, as seen in Cry2Aa, still requires 
some contribution from block D (sequence alignment 
presented in Figure 5). 

Figure 2a projects the Cry2Aa/Cry2Ab homolog differences onto
the van der Waals surface of Cry2Aa (for clarity, only domains
II and III are shown). In the D block, there are nine residues
that differ between Cry2Aa and Cry2Ab. Surprisingly, most of
these are buried. The notable exceptions are Ile318 and Gly324
(Asn and Val, respectively, in Cry2Ab), which are distant from
the putative binding epitope, and the moderately exposed
Ser309 (Asn in Cry2Ab) within the putative binding epitope
(Table 2). Ile307 and Ile311 are found packed behind exposed
residues on the putative binding surface. Almost half of the
variant residues from block D (Thr314, Ser334, Asn336, and
Ser337) are in a cluster that is packed behind the β4-β5 loop
presented from within the 72-residue L block (Figures 2b and
2c).
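The buried-versus-exposed split described above can be reproduced from the exposed surface areas listed for the D block residues in Table 2. In the sketch below, the surface areas are transcribed from that table, whereas the 10 Å² cutoff is a hypothetical threshold chosen purely for illustration; the text itself does not state a numerical definition of "buried".

```python
# Exposed surface areas (square angstroms) for the nine D block residues,
# transcribed from Table 2.
D_BLOCK_EXPOSURE = {
    "Ile307": 6, "Ser309": 26, "Ile311": 1,
    "Thr314": 7, "Ile318": 91, "Gly324": 78,
    "Ser334": 5, "Asn336": 6, "Ser337": 0,
}

# Hypothetical cutoff: residues exposing less than this area count as buried.
# This value is an assumption for the example, not a figure from the text.
BURIED_CUTOFF = 10.0


def split_by_exposure(exposure=D_BLOCK_EXPOSURE, cutoff=BURIED_CUTOFF):
    """Return (buried, exposed) residue lists under the chosen cutoff."""
    buried = [name for name, area in exposure.items() if area < cutoff]
    exposed = [name for name, area in exposure.items() if area >= cutoff]
    return buried, exposed


if __name__ == "__main__":
    buried, exposed = split_by_exposure()
    print("buried: ", ", ".join(buried))
    print("exposed:", ", ".join(exposed))
```

With this assumed cutoff, six of the nine residues fall in the buried group, and the three remaining residues are the ones the text singles out: Ile318, Gly324, and the moderately exposed Ser309.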

Two of these buried variant residues, Thr314 and Ser337, make
side chain-main chain hydrogen bonds with the β4-β5 loop. A
third residue, Asn336, makes main chain-main chain hydrogen
bonds with the β4-β5 loop, and Thr314 makes side chain-side
chain hydrogen bonds with Ser334. In the less active homolog,
Cry2Ab, these residues are replaced with approximately
isosteric nonhydrogen bonding residues, suggesting that this
pattern of substitutions abolishes affinity for the dipteran
receptor (Thr314Ala, Ser334Ala, Asn336Leu, and Ser337Ala). It
is conceivable that the Ile318Val and Gly324Val substitutions
are part of a region of the protein that interacts only with
the receptor(s) found in dipteran species and shares some
components with the putative binding epitope that we identify.
However, we speculate that the same exposed surface area binds
to the lepidopteran and dipteran receptors. In this model,
these solvent inaccessible residues behind the putative
receptor binding surface may serve to alter the conformation
of the β4-β5 loop, with its several hydrophilic
specificity-determining residues. Similar modulation of
specificity in protein-protein interactions by noncontact
residues is seen in the context of immunoglobulin residues
that affect conformation of the complementarity-determining
residues (CDR) at the binding surface [34]. Likewise, affinity
maturation of a Fab/antigen complex results in the
optimization of antibody/antigen binding by residues 15 Å from
the interaction surface [35].

The structures of Bt toxins provide a template for 
design and discovery of changes that alter receptor 
targeting in order to either broaden selectivity for better 
field efficacy, prolong the life of existing agents, or avoid 



unwanted effects on nontarget organisms. Resistance 
to Bt toxins is recognized as a potential limitation in 
their application. Early studies concluded that recessive 
genes controlled the inheritance of Bt resistance. How- 
ever, a recent study suggests that Bt resistance can be 
inherited as an incompletely dominant autosomal gene 
[36]. The authors note that such a mechanism of Bt 
resistance inheritance in the field would significantly 
reduce the usefulness of the high dose/refuge strategy 
of resistance management in which some mates are 
not challenged with toxin. Knowledge of any presumed 
modifications in the receptor that cause resistance can 
potentially instruct rational protein engineering of the 
receptor binding surface to yield toxins that might by- 
pass resistance and still bind to the modified receptor 
of resistant insect species. 

Potential collateral effects upon nontarget insect spe- 
cies [36] and effects upon nontarget predatory insects 
that consume target insect species [37] have been at- 
tributed to Bt GM crops. The structures provide a blue- 
print for focused mutagenesis followed by screening to 
select for each specific target species in a particular 
crop, so as to diminish collateral toxicity to nontarget 
species. By shedding light on the molecular basis of 
toxin-host receptor recognition, the structure provides 
a foundation for engineering Bt-based toxin genes that 
may develop broader insect species specificity, species 
selectivity tuned to reduce collateral impact upon non- 
target species, and longer field efficacy. 

Biological Implications 

We have determined the three-dimensional structure of 
the insecticidal toxin Cry2Aa in order to understand the 
structural determinants of toxin specificity. Genetically 
modified (GM) crops that express insecticidal protein 
toxins are an integral part of modern agriculture.
Proteins normally produced by different strains of Bacillus
thuringiensis (Bt) during sporulation mediate a species- 
specific pathogenicity of Bt toward insect larvae of the 
target species and are the active agents in the majority 
of biorational pesticides and insect-resistant transgenic 
crops. Though promising as a crop protection alterna- 
tive, problems exist with transgenic crops. Bt GM crops 
may pose a threat to nontarget insect species [16] or to 
nontarget predatory insects that consume target insect 
species [37]. In addition, resistance to Bt toxins is recog- 
nized as a potential limitation to their application that 
is ecologically friendlier than traditional organic pesti- 
cides. For instance, EPA approval of Bt GM maize was 
contingent upon the establishment of viable resistance
management strategies [36]. 

Cry2Aa is among an unusual subset of crystalline (Cry) 
proteins possessing broad insect species specificity by 
exhibiting high specific activity against larvae from two 
insect orders, Lepidoptera and Diptera [24, 25], of
agricultural and public health significance. Also, the
Cry2Aa protoxin is significantly smaller (72 kDa) than
those of the Cry1 proteins (~135 kDa) in the current
generation of transgenic crops. Since gene size can be a 
limiting factor of protein expression in plants, transgenic 
constructs based upon Cry1 usually express a smaller 
portion of the gene that contains essentially the acti- 
vated toxin. Cry protoxins are presumed to be more 
environmentally stable than the activated toxins; hence, 
transgenic constructs that express the Cry2Aa protoxin 
could deliver higher toxin doses in the field due to 
greater stability [22]. Also, expression of the protoxin 
reduces collateral damage to nontarget insect species 
since it depends on specificity of the host proteases for 
activation [3, 37]. Chloroplast-directed overexpression 
of the Cry2Aa protoxin has been demonstrated and 
shows expression levels equivalent to 2%-3% total sol- 
uble protein in transformed leaves [22]. Such high levels 
of expression, 20- to 30-fold higher than current nuclear 
transgenics, could diminish the opportunity for devel- 
oping resistance by significantly increasing toxin dose 
at the initial encounter with the insect. 

Cry2Ab, an 87% sequence identical homolog of 
Cry2Aa, has negligible activity against dipteran species 
and 3- to 8-fold less activity against certain lepidopteran 
species [25, 26]. Also, there exists a unique body of 
chimeric-scanning mutagenesis data in the Cry2Aa/ 
Cry2Ab system that has identified determinants of spe- 
cies specificity in the amino acid sequence [24, 25]. 
Correlating the structure with chimeric-scanning data
indicates that the putative receptor binding epitope of
Cry2Aa lies on the core β sheet and differs from the end
of the β sheet apical loops of domain II, as suggested
from structures of the other Cry toxins [28, 29]. Thus, a
target surface is defined for directed mutagenesis that 
may focus engineering of the toxin either to develop 
broader insect species specificity, species selectivity 
tuned to reduce collateral impact upon nontarget spe- 
cies, or longer field efficacy. Until now, the search for 
new insecticidal bacterial toxins involved collection and 
assay of novel isolates of Bt and other bacteria known 
to have insecticidal activity. Recent reports describe the 
isolation of bacterial species that produce new classes 
of insecticidal toxin [38]. These structure data may per- 
mit rational engineering of insecticidal Cry toxins with 
desired characteristics. 

Experimental Procedures 
Cloning of Cry2Aa 

Oligonucleotide primers flanking the coding region of cry2Aa were
generated based on the published sequence of the gene from Bt
kurstaki strain HD-1 [26]. Primer GKP-6 is a 29-mer that corresponds
to the N-terminal 26 nucleotides of the coding region (Figure 6a).
Primer GKP-7 is a 25-mer that corresponds to a fragment overlapping
the HindIII site that is located ~350 nucleotides downstream
from the stop codon (Figure 6a). Plasmid DNA isolated from Bt
kurstaki HD-1 served as a template for the PCR reaction. The
resulting 2100 bp fragment was purified and served as the probe used
to identify the Cry2Aa operon with its accompanying open reading
frames. The hexamer-primed labeling method was used to incorporate
32P-dCTP into the probe.

Previously, it was indicated that the entire gene, including the
coding region and the promoter, is present on a 5.0 kb HindIII
fragment [26] of a plasmid isolated from Bt kurstaki HD-1. The 3.5-7 kb
fragments obtained by HindIII digestion of plasmid DNA isolated
from Bt kurstaki HD-1 were ligated into an E. coli cloning vector,
pTZ18R (Pharmacia, vecbase accession #VB0071), and were used
to transform E. coli DH5 cells by electroporation. Electroporated
DH5 cells were plated onto LB-Amp50 plates containing X-gal and
IPTG for color selection. The presence of the cry2Aa gene in the
transformed colonies (white) was confirmed by hybridization of the
PCR-generated probe. Restriction analysis was used to confirm
that the clones contained inserts with the cry2Aa gene and also to
establish the orientation with which the fragment was inserted into
pTZ18R. The results of this analysis revealed that one of the clones
corresponded to the orientation designated pSB302 (Figure 6b),
while two clones had the opposite orientation and were designated
pSB303. pSB304, obtained by deleting the 1.2 kb BamHI fragment
(dotted line in Figure 6b), was also transformed into DH5.

Total protein analysis for proteins produced by E. coli strain DH5
carrying pSB302, pSB303, or pSB304 was performed by SDS-PAGE.
A protein band of molecular weight 62 kDa, absent in the original
DH5 cells, was observed in all of the clones examined. The level of
expression was the highest in those cells bearing pSB304. Most of
the toxin could be found in the pelletable fraction following
sonication of the cells. Samples were evaluated for biological activity by
bioassay using Manduca sexta as the target insect. All of the clones
(pSB302, pSB303, and pSB304) were active with LD50 values of
~500 ppm.

The pSB304 plasmid retains a unique EcoRI site, ~200 nucleotides
upstream of the cry2Aa promoter, into which the EcoRI-linearized
Bacillus cereus vector pBC16.1 (GenBank accession number
U32369) was cloned (Figure 6b). The resulting clone was used to
transform E. coli DH5, and clones containing the new plasmid were
designated pSB307. Confirmation of the identity of the new plasmid
and determination of the orientation of the pBC16.1 insert, with
respect to the cry2Aa gene, was made by restriction mapping. One
of the plasmids, pSB307.4, was transformed into Bt cryB (a cry⁻
strain) by electroporation. The plasmid content of these isolates
was verified by restriction mapping.

Cry2Aa expressed well in Bt cryB cells transformed with
pSB307.4, and the protein formed crystalline (rhombohedral)
inclusions. The cells were harvested by centrifugation, washed with
water, and lyophilized. Dried cell mass was added to the insect diet and
fed to M. sexta larvae. The results confirmed that Bt cryB (pSB307.4)
exhibited high insecticidal activity.

Protein Expression and Purification 

The plasmid (pSB307.4) containing the Cry2Aa operon, with its
accompanying open reading frames, was used to transform the cry⁻
strain of Bt (cryB) as previously described [39]. Cry2Aa was purified
from the crystalline inclusions produced in the cells. Inclusions were
harvested by cell lysis and centrifugation. Crystalline inclusions
were washed repeatedly with 0.5 N NaCl to remove proteases and
were transferred to buffer (10 mM Tris-HCl, 1 mM EDTA [pH 8.0])
with 2% mercaptoethanol. Titrating the pH to 10.5, using NH4OH,
solubilized the protein from the crystalline inclusion bodies. The
protein was purified by Sephacryl S300HR column chromatography
as described [40] and concentrated by ultrafiltration to 10 mg ml⁻¹.

Crystallization and Structure Determination 
For recrystallization, hanging drops of the resulting concentrated
protein (10 μl concentrated protein buffered as described above)
were equilibrated against wells that contained Tris buffer (10 mM
Tris-HCl, 1 mM EDTA [pH 8.0]). Crystallization was induced by the
gradual shift to neutral pH as the mobile NH3 diffused from the
drops. Crystals were transferred to storage buffer (50 mM PIPES,
250 mM NaCl [pH 6.8]) with 2% mercaptoethanol. The resulting
crystals are in space group P4₃2₁2; unit cell constants a = 85.6 Å,
c = 163.9 Å. They have one monomer in the asymmetric unit, an
estimated 34% solvent content, and diffract to ~3.0 Å using Cu Kα
X-rays from a rotating anode generator and to 2.0 Å at a synchrotron
source after flash freezing.

For the collection of data at 100 K, the crystals were transferred
in three steps to a final 20% solution of cryoprotectant (a 1:1 mixture
of 1,2-propanediol and glycerol) and storage buffer and flash frozen
in a cold nitrogen stream. X-ray diffraction data were collected at
SSRL beamline 7.1 using a wavelength of 1.08 Å. Intensity data were
integrated, scaled, and merged using HKL [41]. The overall Wilson
B factor (3.0 Å < d < 2.2 Å) was 14 Å².

De novo phasing was achieved using multiple isomorphous
replacement after attempts to find a molecular replacement solution
to the phase problem employing the available coordinates of Cry3Aa
and Cry1Aa were unsuccessful. The heavy atom derivatives (Table
1) were solved from difference Patterson maps as displayed using
XtalView [42]. Difference Fourier inspection for minor sites and
refinement of the heavy atom positions, occupancies, and B factors
was completed in PHASES [43]. The resulting protein electron
density map was subjected to solvent-flipping density modification, as
implemented in Solomon [44]. The helical bundle was apparent in
5 Å maps; at 3 Å resolution, the correct enantiomorph was clear from
its stereochemistry. Using Cry1Aa as the initial building template,
polyalanine versions of the helical and jellyroll domains were
manually positioned using O [45], and the fit was optimized using the
real-space refinement package ESSENS [46]. Positional and simulated
annealing refinement were carried out using the maximum likelihood
target of XPLOR 3.85X [47].

Acknowledgments 

We thank D.H. Dean, E.A. Zhukovsky, J. Finer-Moore, and R.J. Flet- 
terick for helpful discussions during the course of this investigation. 
We thank V. Ramalingam for assistance in crystallization and data 
collection and G.K. Powell for providing us with the Cry2Aa clone. 
This work is based upon research conducted at SSRL, which is 
funded by the Department of Energy, Office of Basic Energy Sci- 
ences. This work was supported by the National Institutes of Health 
(GM-244485 to R.M.S.).

Received: December 27, 2000 
Revised: April 4, 2001 
Accepted: April 6, 2001 

References 

1. Schnepf, E., et a!., and Dean, D.H. (1998). Bacillus thuringiensis 
and its pesticidal crystal proteins. Microbiol. Mol. Biol. Rev. 62, 
775-806. 

2. Tojo, A., and Aizawa, K. (1 983). Dissolution and degradation of 
Bacillus thuringiensis 5-endotoxin by gut juice protease of the 
silkworm Bombyx mori. Appl. Environ. Microbiol 45, 576-580. 

3. Aronson, A.I., Han, E.S., McGaughey, W., and Johnson, D. 
(1991). The solubility of inclusion proteins from Bacillus thurin- 
giensis is dependent upon protoxin composition and is a factor 
in toxicity to insects. Appl. Environ. Microbiol. 57, 981-986.

4. Choma, T., Surewicz, W.R., Carey, P.R., Pozsgay, M., Raynor,
T., and Kaplan, H. (1990). Unusual proteolysis of the protoxin 
and toxin from Bacillus thuringiensis. Structural implications. 
Eur. J. Biochem. 189, 523-527. 

5. Hofmann, C., Vanderbruggen, H., Höfte, H., Van Rie, J., Jansens,
S., and Van Mellaert, H. (1988). Specificity of Bacillus thuringiensis
δ-endotoxins is correlated with the presence of high-affinity
binding sites in the brush border membrane of target
insect midguts. Proc. Nat. Acad. Sci. USA 85, 7844-7848.

6. Wolfersberger, M.G., Hofmann, C., and Lüthy, P. (1986). Interaction
of Bacillus thuringiensis with membrane vesicles isolated
from lepidopteran larval midgut. Zbl. Bakt. Mikrobiol. Hyg. I.
Suppl. 15, 237-238.

7. Van Rie, J., Jansens, S., Höfte, H., Degheele, D., and Van Mellaert,
H. (1989). Specificity of Bacillus thuringiensis δ-endotoxins:
importance of specific receptors on the brush border membrane
of the mid-gut of target insects. Eur. J. Biochem. 186,
239-247.



8. Slatin, S.L., Abrams, C.K., and English, L. (1990). Delta-endotoxins
form cation-selective channels in planar lipid bilayers.
Biochem. Biophys. Res. Commun. 169, 765-772.

9. English, L., Readdy, T.L., and Bastian, A.E. (1991). Delta-endotoxin-induced
leakage of 86Rb+-K+ and H2O from phospholipid
vesicles is catalyzed by reconstituted midgut membrane. Insect
Biochem. 21, 177-184.

10. Schwartz, J.L., Garneau, L., Savaria, D., Masson, L., and Brousseau,
R. (1993). Lepidopteran-specific crystal toxins from Bacillus
thuringiensis form cation- and anion-selective channels in
planar lipid bilayers. J. Membrane Biol. 132, 53-62.

11. Harvey, W.R., and Wolfersberger, M.G. (1979). Mechanism of 
inhibition of active potassium transport in isolated midgut of 
Manduca sexta by Bacillus thuringiensis endotoxin. J. Exp. Biol. 
83, 293-304. 

12. Harvey, W.R., Cioffi, M., and Wolfersberger, M.G. (1986). Transport
physiology of lepidopteran midgut in relation to the action
of B.t. delta-endotoxin. In Fundamental and Applied Aspects of 
Invertebrate Pathology, J.M. Vlak, D. Peters, and R.A. Samson, 
eds. (Wageningen, The Netherlands: Grafisch Berdrijf Ponsen 
and Looijen), pp. 11-14. 

13. Knowles, B.H., and Ellar, D.J. (1987). Colloid-osmotic lysis is 
a general feature of the mechanism of Bacillus thuringiensis 
δ-endotoxins with different insect specificity. Biochim. Biophys.
Acta 924, 509-518.

14. Wolfersberger, M.G. (1992). V-ATPase energized epithelia and 
biological insect control. J. Exp. Biol. 172, 377-386. 

15. Crickmore, N., et al., and Dean, D.H. (1998). Revision of the 
nomenclature of the Bacillus thuringiensis pesticidal crystal pro- 
teins. Microbiol. Mol. Biol. Rev. 62, 807-813. 

16. Losey, J.E., Rayor, L.S., and Carter, M.E. (1999). Transgenic 
pollen harms monarch larvae. Nature 399, 214.

17. Van Rie, J., McGaughey, W.H., Johnson, D.E., Barnett, B.D.,
and Van Mellaert, H. (1990). Mechanism of insect resistance to
the microbial insecticide Bacillus thuringiensis. Science 247, 
72-74. 

18. McGaughey, W.H., Gould, F., and Gelernter, W. (1998). Bt resistance
management. Nature Biotechnol. 16, 144-146.

19. Yamamoto, T., and McLaughlin, R.E. (1981). Isolation of a protein
from the parasporal crystal of Bacillus thuringiensis var.
kurstaki toxic to the mosquito larva Aedes taeniorhynchus.
Biochem. Biophys. Res. Commun. 103, 414-421.

20. Donovan, W.P., Dankocsik, C.C., Gilbert, M.P., Gawron-Burke, 
M.C., Groat, R.G., and Carlton, B.C. (1988). Amino acid se- 
quence and entomocidal activity of the P2 crystal protein. An 
insect toxin from Bacillus thuringiensis var. kurstaki. J. Biol. 
Chem. 263, 561-567. 

21. Yamamoto, T., and Powell, G.K. (1993). Bacillus thuringiensis 
crystal proteins: recent advances in understanding its insectici- 
dal activity. In Advanced Engineered Pesticides. L Kim, ed. 
(New York: Marcel Dekker), pp. 3-42. 

22. Kota, M., Daniell, H., Varma, S., Garczynski, S.F., Gould, F., and 
Moar, W.J. (1999). Overexpression of the Bacillus thuringiensis 
(Bt) Cry2Aa2 protein in chloroplasts confers resistance to plants 
against susceptible and Bt-resistant insects. Proc. Nat. Acad. 
Sci. USA 96, 1840-1845. 

23. English, L., et al., and Slatin, S.L. (1994). Mode of action of
CryIIA: a Bacillus thuringiensis delta-endotoxin. Insect Biochem.
Mol. Biol. 24, 1025-1035.

24. Widner, W.R., and Whiteley, H.R. (1990). Location of the dipteran
specificity region in a lepidopteran-dipteran crystal protein from
Bacillus thuringiensis. J. Bact. 172, 2826-2832.

25. Liang, Y., and Dean, D.H. (1994). Location of a lepidopteran
specificity region in insecticidal crystal protein CryIIA from
Bacillus thuringiensis. Mol. Microbiol. 13, 569-575.

26. Widner, W.R., and Whiteley, H.R. (1989). Two highly related
insecticidal crystal proteins of Bacillus thuringiensis subsp.
kurstaki possess different host range specificities. J. Bacteriol. 171,
965-974.

27. Audtho, M., Valaitis, A.P., Alzate, O., and Dean, D.H. (1999). 
Production of chymotrypsin-resistant Bacillus thuringiensis
Cry2Aa1 delta-endotoxin by protein engineering. Appl. Environ. 
Microbiol. 65, 4601-4605. 

28. Li, J., Carroll, J., and Ellar, D.J. (1991). Crystal structure of
insecticidal δ-endotoxin from Bacillus thuringiensis at 2.5 Å
resolution. Nature 353, 815-821.

29. Grochulski, P., et al., and Cygler, M. (1995). Bacillus thuringiensis
CryIA(a) insecticidal toxin: crystal structure and channel
formation. J. Mol. Biol. 254, 447-464.

30. Lee, M.K., Young, B.A., and Dean, D.H. (1995). Domain III
exchanges of Bacillus thuringiensis Cry1A toxins affect binding to
different gypsy moth midgut receptors. Biochem. Biophys. Res. 
Commun. 216, 306-312. 

31. de Maagd, R.A., et al., and Bosch, D. (1996). Domain III
substitution in Bacillus thuringiensis delta-endotoxin CryIA(b) results in
superior toxicity for Spodoptera exigua and altered membrane 
protein recognition. Appl. Environ. Microbiol. 62, 1537-1543. 

32. Schwartz, J.L., Potvin, L., Chen, X.J., Brousseau, R., Laprade,
R., and Dean, D.H. (1997). Single-site mutations in the conserved
alternating-arginine region affect ionic channels formed by
Cry1Aa, a Bacillus thuringiensis toxin. Appl. Environ. Microbiol.
63, 3978-3984.

33. Von Tersch, M.A., Slatin, S.L., Kulesza, C.A., and English, L.H. 
(1994). Membrane-permeabilizing activities of Bacillus thuringiensis
coleopteran-active toxin CryIIIB2 and CryIIIB2 domain
I peptide. Appl. Environ. Microbiol. 60, 3711-3717. 

34. Foote, J., and Winter, G. (1991). Antibody framework residues
affecting the conformation of the hypervariable loops. J. Mol. 
Biol. 224, 487-499. 

35. Wedemayer, G.J., Patten, P.A., Wang, L.H., Schultz, P.G., and
Stevens, R.C. (1997). Structural insights into the evolution of an 
antibody combining site. Science 276, 1665-1669. 

36. Huang, F., Buschman, L.L., Higgins, R.A., and McGaughey, W.H. 
(1999). Inheritance of resistance to Bacillus thuringiensis toxin 
(Dipel ES) in the European corn borer. Science 284, 965-967.

37. Hilbeck, A., Moar, W.J., Pusztai-Carey, M., Fillipini, A., and
Bigler, F. (1999). Prey-mediated effects of Cry1Ab toxin and
protoxin and Cry2A protoxin on the predator Chrysoperla
carnea. Entomol. Exp. Appl. 91, 305-316.

38. Bowen, D., et al., and ffrench-Constant, R.H. (1998). Insecticidal
toxins from the bacterium Photorhabdus luminescens. Science
280, 2129-2132.

39. Sasaki, J., et al., and Yamamoto, T. (1996). Insecticidal activity 
of the protein encoded by the cryV gene of Bacillus thuringiensis 
kurstaki INA-02. Curr. Microbiol. 32, 195-200. 

40. Yamamoto, T. (1989). Identification of entomocidal toxins of
Bacillus thuringiensis by high-performance liquid chromatogra- 
phy. ACS Symposium Series, 432, 46-60. 

41. Otwinowski, Z., and Minor, W. (1997). Processing of X-ray dif- 
fraction data collected in oscillation mode. Methods Enzymol. 
276, 307-326. 

42. McRee, D. (1993). Practical Protein Crystallography. (San Diego,
CA: Academic Press). 

43. Furey, W., and Swaminathan, S. (1997). PHASES-95: A program
package for the processing and analysis of diffraction data from
macromolecules. In Methods in Enzymology, C. Carter and R.
Sweet, eds. (Orlando, FL: Academic Press), pp. 307-326.

44. Abrahams, J.P., and Leslie, A.G. (1996). Methods used in the 
structure determination of bovine mitochondrial F1 ATPase.
Acta Crystallogr. D 52, 30-42. 

45. Jones, T.A., Zou, J.Y., Cowan, S.W., and Kjeldgaard, M. (1991). 
Improved methods for building protein models in electron den- 
sity maps and the location of errors in these models. Acta Crys- 
tallogr. A 47, 110-119. 

46. Kleywegt, G.J., and Jones, T.A. (1997). Template convolution
to enhance or detect structural features in macromolecular elec- 
tron-density maps. Acta Crystallogr. D 53, 179-185. 

47. Brünger, A.T. (1993). X-PLOR Version 3.1: A System for X-ray
Crystallography and NMR. (New Haven, CT: Yale University 
Press). 

48. Ferrin, T.E., Huang, C.C., Jarvis, L.E., and Langridge, R. (1988). 
The MIDAS display system. J. Mol. Graph. 6, 13-27, 36-37. 

49. Nicholls, A. (1992). GRASP Manual. (New York: Columbia Uni- 
versity). 

50. McDonald, I.K., and Thornton, J.M. (1994). Satisfying hydrogen 
bonding potential in proteins. J. Mol. Biol. 238, 777-793. 

51. Barton, G.J. (1993). ALSCRIPT: A tool to format multiple se- 
quence alignments. Protein Eng. 6, 37-40. 



Accession Numbers 

The coordinates and structure factors for Cry2Aa have been deposited
with the Protein Data Bank (accession code 1I5P).